Problem Statement¶
Business Context¶
Renewable energy sources play an increasingly important role in the global energy mix as efforts to reduce the environmental impact of energy production intensify.
Among the renewable energy alternatives, wind energy is one of the most developed technologies worldwide. The U.S. Department of Energy has put together a guide to achieving operational efficiency using predictive maintenance practices.
Predictive maintenance uses sensor information and analysis methods to measure and predict degradation and future component capability. The idea behind predictive maintenance is that failure patterns are predictable: if component failure can be predicted accurately and the component is replaced before it fails, the costs of operation and maintenance will be much lower.
The sensors fitted across different machines involved in the process of energy generation collect data related to various environmental factors (temperature, humidity, wind speed, etc.) and additional features related to various parts of the wind turbine (gearbox, tower, blades, brake, etc.).
Objective¶
“ReneWind” is a company working on improving the machinery/processes involved in the production of wind energy using machine learning, and it has collected sensor data on generator failures of wind turbines. They have shared a ciphered version of the data, as the data collected through sensors is confidential (the type of data collected varies from company to company). The data has 40 predictors, with 20,000 observations in the training set and 5,000 in the test set.
The objective is to build various classification models, tune them, and find the best one that will help identify failures so that the generators could be repaired before failing/breaking to reduce the overall maintenance cost. The nature of predictions made by the classification model will translate as follows:
- True positives (TP) are failures correctly predicted by the model. These will result in repairing costs.
- False negatives (FN) are real failures where there is no detection by the model. These will result in replacement costs.
- False positives (FP) are detections where there is no failure. These will result in inspection costs.
It is given that the cost of repairing a generator is much less than the cost of replacing it, and the cost of inspection is less than the cost of repair.
“1” in the target variable should be considered as “failure” and “0” represents “no failure”.
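To make the cost hierarchy concrete, here is a minimal sketch with hypothetical per-unit costs (the actual costs are not provided in the problem statement; the numbers below are assumptions chosen only to respect the stated ordering):

```python
# Hypothetical per-unit costs (assumed for illustration only):
# inspection (FP) < repair (TP) < replacement (FN)
COST_INSPECTION = 1    # false positive  -> inspection
COST_REPAIR = 5        # true positive   -> pre-failure repair
COST_REPLACEMENT = 40  # false negative  -> missed failure, replacement

def maintenance_cost(tp, fp, fn):
    """Total maintenance cost implied by a model's confusion-matrix counts."""
    return tp * COST_REPAIR + fp * COST_INSPECTION + fn * COST_REPLACEMENT

# A model that misses fewer failures is cheaper overall,
# even if it raises more false alarms:
print(maintenance_cost(tp=250, fp=60, fn=30))   # -> 2510
print(maintenance_cost(tp=270, fp=120, fn=10))  # -> 1870
```

Under any cost ordering like this, false negatives dominate the total cost, which is what motivates the recall-focused evaluation criterion used for model building.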
Data Description¶
The data provided is a transformed version of the original data which was collected using sensors.
- Train.csv - To be used for training and tuning of models.
- Test.csv - To be used only for testing the performance of the final best model.
Both the datasets consist of 40 predictor variables and 1 target variable.
Installing and Importing the necessary libraries¶
# Installing the libraries with the specified version
!pip install --no-deps tensorflow==2.18.0 scikit-learn==1.3.2 matplotlib==3.8.3 seaborn==0.13.2 numpy==1.26.4 pandas==2.2.2 -q --user --no-warn-script-location
Note:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- On executing the above line of code, you might see a warning regarding package dependencies. This warning can be ignored, as the above code ensures that all necessary libraries and their dependencies are installed to successfully execute the code in *this notebook*.
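After restarting, the installed versions can be confirmed with a quick check (a convenience sketch, not part of the original notebook):

```python
import importlib.metadata as md

# Distribution names as used by pip (note: "scikit-learn", not "sklearn")
for pkg in ["tensorflow", "scikit-learn", "matplotlib", "seaborn", "numpy", "pandas"]:
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "NOT INSTALLED")
```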
import pandas as pd # Library for data manipulation and analysis.
import numpy as np # Fundamental package for scientific computing.
import matplotlib.pyplot as plt # Plotting library for creating visualizations.
import seaborn as sns #For advanced visualizations.
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split # Function for splitting datasets for training and testing.
#Imports functions for evaluating the performance of machine learning models
from sklearn.metrics import confusion_matrix, f1_score,accuracy_score, recall_score, precision_score, classification_report
# Imports the metrics module from scikit-learn
from sklearn import metrics
import time # Module for time-related operations.
import tensorflow as tf #An end-to-end open source machine learning platform
from tensorflow import keras # High-level neural networks API for deep learning.
from keras import backend # Abstraction layer for neural network backend engines.
from keras.models import Sequential # Model for building NN sequentially.
from keras.layers import Dense,Dropout,BatchNormalization # for creating fully connected neural network layers.
Loading the Data¶
# uncomment and run the following lines for Google Colab
# from google.colab import drive
# drive.mount('/content/drive')
df_Train = pd.read_csv('Data/Train.csv') # Train Dataset
df_Test = pd.read_csv('Data/Test.csv') # Test Dataset
pd.set_option('display.width', 300)
Data Overview¶
print(df_Train.head())
print(df_Test.head())
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
0 -4.464606 -4.679129 3.101546 0.506130 -0.221083 -2.032511 -2.910870 0.050714 -1.522351 3.761892 ... 3.059700 -1.690440 2.846296 2.235198 6.667486 0.443809 -2.369169 2.950578 -3.480324 0
1 3.365912 3.653381 0.909671 -1.367528 0.332016 2.358938 0.732600 -4.332135 0.565695 -0.101080 ... -1.795474 3.032780 -2.467514 1.894599 -2.297780 -1.731048 5.908837 -0.386345 0.616242 0
2 -3.831843 -5.824444 0.634031 -2.418815 -1.773827 1.016824 -2.098941 -3.173204 -2.081860 5.392621 ... -0.257101 0.803550 4.086219 2.292138 5.360850 0.351993 2.940021 3.839160 -4.309402 0
3 1.618098 1.888342 7.046143 -1.147285 0.083080 -1.529780 0.207309 -2.493629 0.344926 2.118578 ... -3.584425 -2.577474 1.363769 0.622714 5.550100 -1.526796 0.138853 3.101430 -1.277378 0
4 -0.111440 3.872488 -3.758361 -2.982897 3.792714 0.544960 0.205433 4.848994 -1.854920 -6.220023 ... 8.265896 6.629213 -10.068689 1.222987 -3.229763 1.686909 -2.163896 -3.644622 6.510338 0
[5 rows x 41 columns]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
0 -0.613489 -3.819640 2.202302 1.300420 -1.184929 -4.495964 -1.835817 4.722989 1.206140 -0.341909 ... 2.291204 -5.411388 0.870073 0.574479 4.157191 1.428093 -10.511342 0.454664 -1.448363 0
1 0.389608 -0.512341 0.527053 -2.576776 -1.016766 2.235112 -0.441301 -4.405744 -0.332869 1.966794 ... -2.474936 2.493582 0.315165 2.059288 0.683859 -0.485452 5.128350 1.720744 -1.488235 0
2 -0.874861 -0.640632 4.084202 -1.590454 0.525855 -1.957592 -0.695367 1.347309 -1.732348 0.466500 ... -1.318888 -2.997464 0.459664 0.619774 5.631504 1.323512 -1.752154 1.808302 1.675748 0
3 0.238384 1.458607 4.014528 2.534478 1.196987 -3.117330 -0.924035 0.269493 1.322436 0.702345 ... 3.517918 -3.074085 -0.284220 0.954576 3.029331 -1.367198 -3.412140 0.906000 -2.450889 0
4 5.828225 2.768260 -1.234530 2.809264 -1.641648 -1.406698 0.568643 0.965043 1.918379 -2.774855 ... 1.773841 -1.501573 -2.226702 4.776830 -6.559698 -0.805551 -0.276007 -3.858207 -0.537694 0
[5 rows x 41 columns]
# Overview of the given train and test datasets for a quick sanity check before model creation
data = {'Category':['Train','Test'],
'Shape (Row * Col)': [df_Train.shape,df_Test.shape ],
'Contains Null':[df_Train.isnull().sum().sum(),df_Test.isnull().sum().sum()],
'Column names having Null values': [df_Train.columns[df_Train.isnull().any()].tolist(),df_Test.columns[df_Test.isnull().any()].tolist()]
}
df_overview = pd.DataFrame(data)
df_overview
| Category | Shape (Row * Col) | Contains Null | Column names having Null values | |
|---|---|---|---|---|
| 0 | Train | (20000, 41) | 36 | [V1, V2] |
| 1 | Test | (5000, 41) | 11 | [V1, V2] |
df_Train.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 20000 entries, 0 to 19999 Data columns (total 41 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 V1 19982 non-null float64 1 V2 19982 non-null float64 2 V3 20000 non-null float64 3 V4 20000 non-null float64 4 V5 20000 non-null float64 5 V6 20000 non-null float64 6 V7 20000 non-null float64 7 V8 20000 non-null float64 8 V9 20000 non-null float64 9 V10 20000 non-null float64 10 V11 20000 non-null float64 11 V12 20000 non-null float64 12 V13 20000 non-null float64 13 V14 20000 non-null float64 14 V15 20000 non-null float64 15 V16 20000 non-null float64 16 V17 20000 non-null float64 17 V18 20000 non-null float64 18 V19 20000 non-null float64 19 V20 20000 non-null float64 20 V21 20000 non-null float64 21 V22 20000 non-null float64 22 V23 20000 non-null float64 23 V24 20000 non-null float64 24 V25 20000 non-null float64 25 V26 20000 non-null float64 26 V27 20000 non-null float64 27 V28 20000 non-null float64 28 V29 20000 non-null float64 29 V30 20000 non-null float64 30 V31 20000 non-null float64 31 V32 20000 non-null float64 32 V33 20000 non-null float64 33 V34 20000 non-null float64 34 V35 20000 non-null float64 35 V36 20000 non-null float64 36 V37 20000 non-null float64 37 V38 20000 non-null float64 38 V39 20000 non-null float64 39 V40 20000 non-null float64 40 Target 20000 non-null int64 dtypes: float64(40), int64(1) memory usage: 6.3 MB
df_Test.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 5000 entries, 0 to 4999 Data columns (total 41 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 V1 4995 non-null float64 1 V2 4994 non-null float64 2 V3 5000 non-null float64 3 V4 5000 non-null float64 4 V5 5000 non-null float64 5 V6 5000 non-null float64 6 V7 5000 non-null float64 7 V8 5000 non-null float64 8 V9 5000 non-null float64 9 V10 5000 non-null float64 10 V11 5000 non-null float64 11 V12 5000 non-null float64 12 V13 5000 non-null float64 13 V14 5000 non-null float64 14 V15 5000 non-null float64 15 V16 5000 non-null float64 16 V17 5000 non-null float64 17 V18 5000 non-null float64 18 V19 5000 non-null float64 19 V20 5000 non-null float64 20 V21 5000 non-null float64 21 V22 5000 non-null float64 22 V23 5000 non-null float64 23 V24 5000 non-null float64 24 V25 5000 non-null float64 25 V26 5000 non-null float64 26 V27 5000 non-null float64 27 V28 5000 non-null float64 28 V29 5000 non-null float64 29 V30 5000 non-null float64 30 V31 5000 non-null float64 31 V32 5000 non-null float64 32 V33 5000 non-null float64 33 V34 5000 non-null float64 34 V35 5000 non-null float64 35 V36 5000 non-null float64 36 V37 5000 non-null float64 37 V38 5000 non-null float64 38 V39 5000 non-null float64 39 V40 5000 non-null float64 40 Target 5000 non-null int64 dtypes: float64(40), int64(1) memory usage: 1.6 MB
df_Train.describe().T
| count | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|
| V1 | 19982.0 | -0.271996 | 3.441625 | -11.876451 | -2.737146 | -0.747917 | 1.840112 | 15.493002 |
| V2 | 19982.0 | 0.440430 | 3.150784 | -12.319951 | -1.640674 | 0.471536 | 2.543967 | 13.089269 |
| V3 | 20000.0 | 2.484699 | 3.388963 | -10.708139 | 0.206860 | 2.255786 | 4.566165 | 17.090919 |
| V4 | 20000.0 | -0.083152 | 3.431595 | -15.082052 | -2.347660 | -0.135241 | 2.130615 | 13.236381 |
| V5 | 20000.0 | -0.053752 | 2.104801 | -8.603361 | -1.535607 | -0.101952 | 1.340480 | 8.133797 |
| V6 | 20000.0 | -0.995443 | 2.040970 | -10.227147 | -2.347238 | -1.000515 | 0.380330 | 6.975847 |
| V7 | 20000.0 | -0.879325 | 1.761626 | -7.949681 | -2.030926 | -0.917179 | 0.223695 | 8.006091 |
| V8 | 20000.0 | -0.548195 | 3.295756 | -15.657561 | -2.642665 | -0.389085 | 1.722965 | 11.679495 |
| V9 | 20000.0 | -0.016808 | 2.160568 | -8.596313 | -1.494973 | -0.067597 | 1.409203 | 8.137580 |
| V10 | 20000.0 | -0.012998 | 2.193201 | -9.853957 | -1.411212 | 0.100973 | 1.477045 | 8.108472 |
| V11 | 20000.0 | -1.895393 | 3.124322 | -14.832058 | -3.922404 | -1.921237 | 0.118906 | 11.826433 |
| V12 | 20000.0 | 1.604825 | 2.930454 | -12.948007 | -0.396514 | 1.507841 | 3.571454 | 15.080698 |
| V13 | 20000.0 | 1.580486 | 2.874658 | -13.228247 | -0.223545 | 1.637185 | 3.459886 | 15.419616 |
| V14 | 20000.0 | -0.950632 | 1.789651 | -7.738593 | -2.170741 | -0.957163 | 0.270677 | 5.670664 |
| V15 | 20000.0 | -2.414993 | 3.354974 | -16.416606 | -4.415322 | -2.382617 | -0.359052 | 12.246455 |
| V16 | 20000.0 | -2.925225 | 4.221717 | -20.374158 | -5.634240 | -2.682705 | -0.095046 | 13.583212 |
| V17 | 20000.0 | -0.134261 | 3.345462 | -14.091184 | -2.215611 | -0.014580 | 2.068751 | 16.756432 |
| V18 | 20000.0 | 1.189347 | 2.592276 | -11.643994 | -0.403917 | 0.883398 | 2.571770 | 13.179863 |
| V19 | 20000.0 | 1.181808 | 3.396925 | -13.491784 | -1.050168 | 1.279061 | 3.493299 | 13.237742 |
| V20 | 20000.0 | 0.023608 | 3.669477 | -13.922659 | -2.432953 | 0.033415 | 2.512372 | 16.052339 |
| V21 | 20000.0 | -3.611252 | 3.567690 | -17.956231 | -5.930360 | -3.532888 | -1.265884 | 13.840473 |
| V22 | 20000.0 | 0.951835 | 1.651547 | -10.122095 | -0.118127 | 0.974687 | 2.025594 | 7.409856 |
| V23 | 20000.0 | -0.366116 | 4.031860 | -14.866128 | -3.098756 | -0.262093 | 2.451750 | 14.458734 |
| V24 | 20000.0 | 1.134389 | 3.912069 | -16.387147 | -1.468062 | 0.969048 | 3.545975 | 17.163291 |
| V25 | 20000.0 | -0.002186 | 2.016740 | -8.228266 | -1.365178 | 0.025050 | 1.397112 | 8.223389 |
| V26 | 20000.0 | 1.873785 | 3.435137 | -11.834271 | -0.337863 | 1.950531 | 4.130037 | 16.836410 |
| V27 | 20000.0 | -0.612413 | 4.368847 | -14.904939 | -3.652323 | -0.884894 | 2.189177 | 17.560404 |
| V28 | 20000.0 | -0.883218 | 1.917713 | -9.269489 | -2.171218 | -0.891073 | 0.375884 | 6.527643 |
| V29 | 20000.0 | -0.985625 | 2.684365 | -12.579469 | -2.787443 | -1.176181 | 0.629773 | 10.722055 |
| V30 | 20000.0 | -0.015534 | 3.005258 | -14.796047 | -1.867114 | 0.184346 | 2.036229 | 12.505812 |
| V31 | 20000.0 | 0.486842 | 3.461384 | -13.722760 | -1.817772 | 0.490304 | 2.730688 | 17.255090 |
| V32 | 20000.0 | 0.303799 | 5.500400 | -19.876502 | -3.420469 | 0.052073 | 3.761722 | 23.633187 |
| V33 | 20000.0 | 0.049825 | 3.575285 | -16.898353 | -2.242857 | -0.066249 | 2.255134 | 16.692486 |
| V34 | 20000.0 | -0.462702 | 3.183841 | -17.985094 | -2.136984 | -0.255008 | 1.436935 | 14.358213 |
| V35 | 20000.0 | 2.229620 | 2.937102 | -15.349803 | 0.336191 | 2.098633 | 4.064358 | 15.291065 |
| V36 | 20000.0 | 1.514809 | 3.800860 | -14.833178 | -0.943809 | 1.566526 | 3.983939 | 19.329576 |
| V37 | 20000.0 | 0.011316 | 1.788165 | -5.478350 | -1.255819 | -0.128435 | 1.175533 | 7.467006 |
| V38 | 20000.0 | -0.344025 | 3.948147 | -17.375002 | -2.987638 | -0.316849 | 2.279399 | 15.289923 |
| V39 | 20000.0 | 0.890653 | 1.753054 | -6.438880 | -0.272250 | 0.919261 | 2.057540 | 7.759877 |
| V40 | 20000.0 | -0.875630 | 3.012155 | -11.023935 | -2.940193 | -0.920806 | 1.119897 | 10.654265 |
| Target | 20000.0 | 0.055500 | 0.228959 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
df_Test.describe().T
| count | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|
| V1 | 4995.0 | -0.277622 | 3.466280 | -12.381696 | -2.743691 | -0.764767 | 1.831313 | 13.504352 |
| V2 | 4994.0 | 0.397928 | 3.139562 | -10.716179 | -1.649211 | 0.427369 | 2.444486 | 14.079073 |
| V3 | 5000.0 | 2.551787 | 3.326607 | -9.237940 | 0.314931 | 2.260428 | 4.587000 | 15.314503 |
| V4 | 5000.0 | -0.048943 | 3.413937 | -14.682446 | -2.292694 | -0.145753 | 2.166468 | 12.140157 |
| V5 | 5000.0 | -0.080120 | 2.110870 | -7.711569 | -1.615238 | -0.131890 | 1.341197 | 7.672835 |
| V6 | 5000.0 | -1.042138 | 2.005444 | -8.924196 | -2.368853 | -1.048571 | 0.307555 | 5.067685 |
| V7 | 5000.0 | -0.907922 | 1.769017 | -8.124230 | -2.054259 | -0.939695 | 0.212228 | 7.616182 |
| V8 | 5000.0 | -0.574592 | 3.331911 | -12.252731 | -2.642088 | -0.357943 | 1.712896 | 10.414722 |
| V9 | 5000.0 | 0.030121 | 2.174139 | -6.785495 | -1.455712 | -0.079891 | 1.449548 | 8.850720 |
| V10 | 5000.0 | 0.018524 | 2.145437 | -8.170956 | -1.353320 | 0.166292 | 1.511248 | 6.598728 |
| V11 | 5000.0 | -2.008615 | 3.112220 | -13.151753 | -4.050432 | -2.043122 | 0.044069 | 9.956400 |
| V12 | 5000.0 | 1.576413 | 2.907401 | -8.164048 | -0.449674 | 1.488253 | 3.562626 | 12.983644 |
| V13 | 5000.0 | 1.622456 | 2.882892 | -11.548209 | -0.126012 | 1.718649 | 3.464604 | 12.620041 |
| V14 | 5000.0 | -0.921097 | 1.803470 | -7.813929 | -2.110952 | -0.896011 | 0.272324 | 5.734112 |
| V15 | 5000.0 | -2.452174 | 3.387041 | -15.285768 | -4.479072 | -2.417131 | -0.432943 | 11.673420 |
| V16 | 5000.0 | -3.018503 | 4.264407 | -20.985779 | -5.648343 | -2.773763 | -0.178105 | 13.975843 |
| V17 | 5000.0 | -0.103721 | 3.336513 | -13.418281 | -2.227683 | 0.047462 | 2.111907 | 19.776592 |
| V18 | 5000.0 | 1.195606 | 2.586403 | -12.214016 | -0.408850 | 0.881395 | 2.604014 | 13.642235 |
| V19 | 5000.0 | 1.210490 | 3.384662 | -14.169635 | -1.026394 | 1.295864 | 3.526278 | 12.427997 |
| V20 | 5000.0 | 0.138429 | 3.657171 | -13.719620 | -2.325454 | 0.193386 | 2.539550 | 13.870565 |
| V21 | 5000.0 | -3.664398 | 3.577841 | -16.340707 | -5.944369 | -3.662870 | -1.329645 | 11.046925 |
| V22 | 5000.0 | 0.961960 | 1.640414 | -6.740239 | -0.047728 | 0.986020 | 2.029321 | 7.505291 |
| V23 | 5000.0 | -0.422182 | 4.056714 | -14.422274 | -3.162690 | -0.279222 | 2.425911 | 13.180887 |
| V24 | 5000.0 | 1.088841 | 3.968207 | -12.315545 | -1.623203 | 0.912815 | 3.537195 | 17.806035 |
| V25 | 5000.0 | 0.061235 | 2.010227 | -6.770139 | -1.298377 | 0.076703 | 1.428491 | 6.556937 |
| V26 | 5000.0 | 1.847261 | 3.400330 | -11.414019 | -0.242470 | 1.917032 | 4.156106 | 17.528193 |
| V27 | 5000.0 | -0.552397 | 4.402947 | -13.177038 | -3.662591 | -0.871982 | 2.247257 | 17.290161 |
| V28 | 5000.0 | -0.867678 | 1.926181 | -7.933388 | -2.159811 | -0.930695 | 0.420587 | 7.415659 |
| V29 | 5000.0 | -1.095805 | 2.655454 | -9.987800 | -2.861373 | -1.340547 | 0.521843 | 14.039466 |
| V30 | 5000.0 | -0.118699 | 3.023292 | -12.438434 | -1.996743 | 0.112463 | 1.946450 | 10.314976 |
| V31 | 5000.0 | 0.468810 | 3.446324 | -11.263271 | -1.822421 | 0.485742 | 2.779008 | 12.558928 |
| V32 | 5000.0 | 0.232567 | 5.585628 | -17.244168 | -3.556267 | -0.076694 | 3.751857 | 26.539391 |
| V33 | 5000.0 | -0.080115 | 3.538624 | -14.903781 | -2.348121 | -0.159713 | 2.099160 | 13.323517 |
| V34 | 5000.0 | -0.392663 | 3.166101 | -14.699725 | -2.009604 | -0.171745 | 1.465402 | 12.146302 |
| V35 | 5000.0 | 2.211205 | 2.948426 | -12.260591 | 0.321818 | 2.111750 | 4.031639 | 13.489237 |
| V36 | 5000.0 | 1.594845 | 3.774970 | -12.735567 | -0.866066 | 1.702964 | 4.104409 | 17.116122 |
| V37 | 5000.0 | 0.022931 | 1.785320 | -5.079070 | -1.240526 | -0.110415 | 1.237522 | 6.809938 |
| V38 | 5000.0 | -0.405659 | 3.968936 | -15.334533 | -2.984480 | -0.381162 | 2.287998 | 13.064950 |
| V39 | 5000.0 | 0.938800 | 1.716502 | -5.451050 | -0.208024 | 0.959152 | 2.130769 | 7.182237 |
| V40 | 5000.0 | -0.932406 | 2.978193 | -10.076234 | -2.986587 | -1.002764 | 1.079738 | 8.698460 |
| Target | 5000.0 | 0.056400 | 0.230716 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
Observation on Data Overview¶
- The Train dataset has 20000 rows and 41 columns
- The Test dataset has 5000 rows and 41 columns
- The Train dataset contains missing values in columns V1 and V2 (18 in each column)
- The Test dataset contains missing values in columns V1 and V2 (5 and 6 respectively)
- In both the Train and Test datasets, all 40 predictor variables are of float data type and the Target variable is of int data type
Exploratory Data Analysis¶
from typing import Optional, Tuple
def histogram_boxplot(
data: pd.DataFrame,
feature: str,
figsize: Tuple[int, int] = (12, 7),
kde: bool = False,
bins: Optional[int] = None
) -> None:
"""
Create a combined boxplot and histogram visualization.
Parameters:
-----------
data : pd.DataFrame
Input dataframe containing the data
feature : str
Column name to visualize
figsize : Tuple[int, int], default=(12, 7)
Figure size as (width, height)
kde : bool, default=False
Whether to show kernel density estimation curve
bins : Optional[int], default=None
Number of histogram bins
"""
# Create subplot grid with shared x-axis
fig, (box_ax, hist_ax) = plt.subplots(
nrows = 2,
sharex=True, # Share x-axis between subplots
gridspec_kw={"height_ratios": (0.25, 0.75)},
figsize=figsize
)
# Create boxplot with mean indicator
sns.boxplot(
data=data,
x=feature,
ax=box_ax,
showmeans=True,
color="blue",
)
# Create histogram with conditional bins parameter
if bins:
sns.histplot(
data=data,
x=feature,
kde=kde,
ax=hist_ax,
            bins=bins
        )
else:
sns.histplot(
data=data,
x=feature,
kde=kde,
ax=hist_ax
)
# Add mean line to histogram
hist_ax.axvline(
data[feature].mean(),
color="violet",
linestyle="--"
)
# Add median line to histogram
hist_ax.axvline(
data[feature].median(),
color="black",
linestyle="-"
)
plt.show()
# function to create labeled barplots
def labeled_barplot(data, feature, perc=False, n=None):
"""
Barplot with percentage at the top
data: dataframe
feature: dataframe column
perc: whether to display percentages instead of count (default is False)
n: displays the top n category levels (default is None, i.e., display all levels)
"""
total = len(data[feature]) # length of the column
count = data[feature].nunique()
if n is None:
plt.figure(figsize=(count + 1, 5))
else:
plt.figure(figsize=(n + 1, 5))
plt.xticks(rotation=90, fontsize=15)
ax = sns.countplot(
data=data,
x=feature,
hue=feature,
palette="Paired",
order=data[feature].value_counts().index[:n].sort_values(),
)
for p in ax.patches:
if perc == True:
label = "{:.1f}%".format(
100 * p.get_height() / total
) # percentage of each class of the category
else:
label = p.get_height() # count of each level of the category
x = p.get_x() + p.get_width() / 2 # width of the plot
y = p.get_height() # height of the plot
ax.annotate(
label,
(x, y),
ha="center",
va="center",
size=12,
xytext=(0, 5),
textcoords="offset points",
) # annotate the percentage
plt.show() # show the plot
Univariate analysis¶
# Iterate through each column (feature) in the training dataframe
for feature in df_Train.columns:
    # For each feature, create both a histogram and boxplot visualization
    # Parameters: dataframe, feature name, figure size (8, 4), kde=True, default bins (None)
    histogram_boxplot(df_Train, feature, (8, 4), True, None)
# Extract key observations from the plots
print(f"Observations for {feature}:")
    print(f"Distribution shape: {'Approximately normal' if abs(df_Train[feature].skew()) < 0.5 else 'Skewed'}")
    print(f"Outliers present: {'Yes' if max(df_Train[feature].max() - df_Train[feature].mean(), df_Train[feature].mean() - df_Train[feature].min()) > 3 * df_Train[feature].std() else 'No'}")
print(f"Mean value: {df_Train[feature].mean():.2f}")
print(f"Data range: {df_Train[feature].min():.2f} to {df_Train[feature].max():.2f}")
Observations for Target: Distribution shape: Skewed Outliers present: Yes Mean value: 0.06 Data range: 0.00 to 1.00
labeled_barplot(df_Train,'Target',perc =True)
labeled_barplot(df_Test,'Target',perc =True)
Observation on Univariate Analysis¶
- As per the above plots, the predictor variables show only mild skewness
- Outliers are present in all predictor variables; outlier removal is not appropriate since we don't know what each ciphered column represents (which part of the machine)
- Target variable in the Train dataset - 94.5% of observations are "no failure" and 5.5% are "failure"
- Target variable in the Test dataset - 94.4% of observations are "no failure" and 5.6% are "failure"
- The target is heavily imbalanced, so accuracy alone would be a misleading evaluation metric
Bivariate Analysis¶
#performing Bivariate analysis
cols_list = df_Train.select_dtypes(include=np.number).columns.tolist()
cols_list.remove("Target")
plt.figure(figsize=(30,20))
sns.heatmap(df_Train[cols_list].corr(),annot = True,vmin=-1,vmax =1,fmt = ".2f",cmap = "Spectral")
plt.show()
Observation on Bivariate Analysis¶
Positive correlation >= 0.8
- V7 with V15
- V8 with V16
- V11 with V29
and some more
Negative correlation <= -0.7
- V2 with V14
- V3 with V23
- V6 with V20, etc.
There are other predictor variables with both positive and negative correlations. Let's draw regression plots for some of the pairs mentioned above.
plt.figure(figsize=(8, 6))
sns.regplot(y=df_Train.V7, x=df_Train.V15)
<Axes: xlabel='V15', ylabel='V7'>
Significant positive correlation between V7 and V15
plt.figure(figsize=(8, 6))
sns.regplot(y=df_Train.V2, x=df_Train.V14)
<Axes: xlabel='V14', ylabel='V2'>
Highly negative correlation between V2 and V14
plt.figure(figsize=(8, 6))
sns.regplot(y=df_Train.V20, x=df_Train.V6)
<Axes: xlabel='V6', ylabel='V20'>
Highly negative correlation between V6 and V20
Data Preprocessing¶
# Changing the Target variable datatype from int to float
df_Train['Target'] = df_Train['Target'].astype(float)
df_Test['Target'] = df_Test['Target'].astype(float)
# Separating the target variable and splitting into train and validation sets (stratified) to avoid data leakage
X = df_Train.drop(['Target'],axis =1)
y = df_Train['Target']
x_train,x_val,y_train,y_val = train_test_split(
X,y,test_size=0.2,random_state=1,stratify=y)
print(x_train.shape)
print(x_val.shape)
print(y_train.shape)
print(y_val.shape)
(16000, 40) (4000, 40) (16000,) (4000,)
# Dividing test data into X_test and y_test
X_test = df_Test.drop(columns = ['Target'] , axis= 1)
y_test = df_Test["Target"]
Treating Missing Values¶
Replacing with median: in this method the missing values are imputed with the median of the column. When a column has outliers, the median is a more appropriate measure of central tendency than the mean for imputing missing values.
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy="median")
impute = imputer.fit(x_train)
x_train_si = impute.transform(x_train)
x_val_si = imputer.transform(x_val)
x_train = pd.DataFrame(x_train_si,columns=x_train.columns, index=x_train.index)
x_val = pd.DataFrame(x_val_si, columns=x_val.columns, index=x_val.index)
Model Building¶
Model Evaluation Criterion¶
Model Evaluation Criterion is based on the given objective for this use case
Objective: Identify failures to enable pre-failure repairs, reducing replacement costs.
True Positives (TP): Correctly predicted failures (Repairs - lower cost).
False Negatives (FN): Real failures missed by the model (Replacements - HIGHEST COST).
False Positives (FP): Predicted failures that aren't failures (Inspections - lowest cost).
Given that replacement costs (due to False Negatives) are the highest, our primary goal is to minimize False Negatives. This directly translates to maximizing Recall.
Recall Definition: Recall = TP / (TP + FN). A higher recall means fewer actual failures are missed by the model.
Hence, our neural network models will be tuned and compared primarily on recall.
While recall is the priority, precision must be reasonably high to avoid too many false alarms, which would lead to unnecessary inspections and increased operational overhead.
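A tiny worked example of the recall computation, with hypothetical counts:

```python
# Suppose 55 turbines actually fail: the model catches 44 and misses 11
TP, FN = 44, 11
recall = TP / (TP + FN)
print(recall)  # -> 0.8: 20% of real failures would still end in costly replacements
```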
Utility Functions¶
def plot(history, name):
"""
Function to plot loss/accuracy
history: an object which stores the metrics and losses.
name: can be one of Loss or Accuracy
"""
fig, ax = plt.subplots() #Creating a subplot with figure and axes.
plt.plot(history.history[name]) #Plotting the train accuracy or train loss
plt.plot(history.history['val_'+name]) #Plotting the validation accuracy or validation loss
plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
plt.ylabel(name.capitalize()) #Capitalizing the first letter.
plt.xlabel('Epoch') #Defining the label for the x-axis.
fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.
# defining a function to compute different metrics to check performance of a classification model that outputs probabilities (e.g., a Keras model)
def model_performance_classification(
model, predictors, target, threshold=0.5
):
"""
Function to compute different metrics to check classification model performance
model: classifier
predictors: independent variables
target: dependent variable
threshold: threshold for classifying the observation as class 1
"""
# checking which probabilities are greater than threshold
pred = model.predict(predictors) > threshold
acc = accuracy_score(target, pred) # to compute Accuracy
recall = recall_score(target, pred, average='macro') # to compute Recall
precision = precision_score(target, pred, average='macro') # to compute Precision
f1 = f1_score(target, pred, average='macro') # to compute F1-score
# creating a dataframe of metrics
df_perf = pd.DataFrame(
{"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,}, index = [0]
)
return df_perf
def make_confusion_matrix(cf,
group_names=None,
categories='auto',
count=True,
percent=True,
cbar=True,
xyticks=True,
xyplotlabels=True,
sum_stats=True,
figsize=None,
cmap='Blues',
title=None):
'''
This function will make a pretty plot of an sklearn Confusion Matrix cm using a Seaborn heatmap visualization.
Arguments
'''
# CODE TO GENERATE TEXT INSIDE EACH SQUARE
blanks = ['' for i in range(cf.size)]
if group_names and len(group_names)==cf.size:
group_labels = ["{}\n".format(value) for value in group_names]
else:
group_labels = blanks
if count:
group_counts = ["{0:0.0f}\n".format(value) for value in cf.flatten()]
else:
group_counts = blanks
if percent:
group_percentages = ["{0:.2%}".format(value) for value in cf.flatten()/np.sum(cf)]
else:
group_percentages = blanks
box_labels = [f"{v1}{v2}{v3}".strip() for v1, v2, v3 in zip(group_labels,group_counts,group_percentages)]
box_labels = np.asarray(box_labels).reshape(cf.shape[0],cf.shape[1])
# CODE TO GENERATE SUMMARY STATISTICS & TEXT FOR SUMMARY STATS
if sum_stats:
#Accuracy is sum of diagonal divided by total observations
accuracy = np.trace(cf) / float(np.sum(cf))
#if it is a binary confusion matrix, show some more stats
if len(cf)==2:
#Metrics for Binary Confusion Matrices
precision = cf[1,1] / sum(cf[:,1])
recall = cf[1,1] / sum(cf[1,:])
f1_score = 2*precision*recall / (precision + recall)
stats_text = "\n\nAccuracy={:0.3f}\nPrecision={:0.3f}\nRecall={:0.3f}\nF1 Score={:0.3f}".format(
accuracy,precision,recall,f1_score)
else:
stats_text = "\n\nAccuracy={:0.3f}".format(accuracy)
else:
stats_text = ""
# SET FIGURE PARAMETERS ACCORDING TO OTHER ARGUMENTS
if figsize==None:
#Get default figure size if not set
figsize = plt.rcParams.get('figure.figsize')
if xyticks==False:
#Do not show categories if xyticks is False
categories=False
# MAKE THE HEATMAP VISUALIZATION
plt.figure(figsize=figsize)
sns.heatmap(cf,annot=box_labels,fmt="",cmap=cmap,cbar=cbar,xticklabels=categories,yticklabels=categories)
if xyplotlabels:
plt.ylabel('True label')
plt.xlabel('Predicted label' + stats_text)
else:
plt.xlabel(stats_text)
if title:
plt.title(title)
#Defining the columns of the dataframe which are nothing but the hyper parameters and the metrics.
columns = ["# hidden layers","# neurons - hidden layer","activation function - hidden layer ","# epochs","batch size","optimizer","learning rate, momentum","weight initializer","regularization","train loss","validation loss","train recall","validation recall","time (secs)"]
#Creating a pandas dataframe.
results = pd.DataFrame(columns=columns)
Initial Model Building (Model 0)¶
- Let's start with a neural network consisting of
- just one hidden layer
- activation function of ReLU
- SGD as the optimizer
#defining batch_size and epochs to be used as same values for all models
batch_size = 32
epochs = 50
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
# initializing the neural network
model_0 = Sequential()
# Defining the input shape explicitly via an Input layer (recommended over passing input_dim to Dense in recent Keras versions)
model_0.add(keras.Input(shape=(x_train.shape[1],)))
# Adding a dense layer with ReLU activation function
model_0.add(Dense(128, activation='relu'))
model_0.add(Dense(1, activation='sigmoid')) # adding output layer with sigmoid as the output is binary 1 or 0
model_0.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 128)            │         5,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 5,377 (21.00 KB)
Trainable params: 5,377 (21.00 KB)
Non-trainable params: 0 (0.00 B)
optimizer = keras.optimizers.SGD() # defining SGD as the optimizer to be used
model_0.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_0.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 371us/step - Recall: 0.2355 - loss: 0.2048 - val_Recall: 0.5991 - val_loss: 0.0991
Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 260us/step - Recall: 0.5939 - loss: 0.0898 - val_Recall: 0.7477 - val_loss: 0.0871
... (epochs 3-48 omitted: loss decreases steadily; val_loss falls from 0.078 to ~0.060 and val_Recall climbs from 0.76 to ~0.85) ...
Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 259us/step - Recall: 0.8732 - loss: 0.0391 - val_Recall: 0.8468 - val_loss: 0.0601
Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 259us/step - Recall: 0.8530 - loss: 0.0414 - val_Recall: 0.8514 - val_loss: 0.0599
print("Time taken in seconds ",end-start)
Time taken in seconds 6.92151403427124
plot(history,'loss')
plot(history,'Recall')
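The `plot` helper used throughout this notebook is not defined in this section; a minimal version, assuming it takes the Keras `History` object and a metric name, might look like:

```python
import matplotlib.pyplot as plt

def plot(history, metric):
    """Plot train vs. validation curves for a metric from a Keras History object."""
    plt.plot(history.history[metric], label="train " + metric)
    plt.plot(history.history["val_" + metric], label="validation " + metric)
    plt.xlabel("epoch")
    plt.ylabel(metric)
    plt.legend()
    plt.show()
```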
Both training and validation recall rise sharply within the first 10 epochs, indicating that the model quickly learns the relevant patterns early in training.
After ~20 epochs, training recall continues to improve slightly while validation recall plateaus and fluctuates, indicating mild overfitting.
A small gap is visible between training and validation recall, suggesting the model may benefit from:
- Dropout or another form of regularization
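One way to act on this is L2 weight regularization on the hidden layer (the dropout variant is built as Model 2 below). A hedged sketch, with an assumed penalty of 1e-4 and 40 input features:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.regularizers import l2

n_features = 40  # x_train.shape[1] for this dataset (assumed)
model_l2 = Sequential([
    Input(shape=(n_features,)),
    # Penalize large weights; 1e-4 is an illustrative strength, not a tuned value
    Dense(128, activation="relu", kernel_regularizer=l2(1e-4)),
    Dense(1, activation="sigmoid"),
])
model_l2.compile(loss="binary_crossentropy", optimizer="sgd", metrics=["Recall"])
```

The regularizer adds a penalty to the loss but no extra parameters, so the parameter count is unchanged.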
model_0_train_perf = model_performance_classification(model_0, x_train, y_train) #threshold = 0.5
print("Model 0 Train Performance")
print(model_0_train_perf)
print("-"*50)
print("Model 0 Validation Performance")
model_0_val_perf = model_performance_classification(model_0,x_val,y_val)
print(model_0_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 135us/step
Model 0 Train Performance
   Accuracy    Recall  Precision  F1 Score
0  0.992125  0.934884   0.989206  0.960243
--------------------------------------------------
Model 0 Validation Performance
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 155us/step
   Accuracy    Recall  Precision  F1 Score
0      0.99  0.924749   0.977805  0.949515
#let's get classification report on y train and y val for Model 0
y_train_pred_0 = model_0.predict(x_train)
y_val_pred_0 = model_0.predict(x_val)
print("Classification Report - Train data Model_0",end="\n\n")
cr_train_model_0 = classification_report(y_train,y_train_pred_0>0.5)
print(cr_train_model_0)
print("Classification Report - Validation data Model_0",end="\n\n")
cr_val_model_0 = classification_report(y_val,y_val_pred_0>0.5)
print(cr_val_model_0)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 138us/step
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 159us/step
Classification Report - Train data Model_0

              precision    recall  f1-score   support

           0       0.99      1.00      1.00     15112
           1       0.99      0.87      0.92       888

    accuracy                           0.99     16000
   macro avg       0.99      0.93      0.96     16000
weighted avg       0.99      0.99      0.99     16000

Classification Report - Validation data Model_0

              precision    recall  f1-score   support

           0       0.99      1.00      0.99      3778
           1       0.96      0.85      0.90       222

    accuracy                           0.99      4000
   macro avg       0.98      0.92      0.95      4000
weighted avg       0.99      0.99      0.99      4000
# Predict probabilities on the validation data and convert them to class labels at a 0.5 threshold
y_pred = (model_0.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 148us/step
# Recording the last-epoch loss and recall for both train and validation for quick comparison
results.loc[0] = [1, [128], ["relu"], 50, 32, ["sgd"],
                  [0.01, "-"], "-", "-",
                  history.history["loss"][-1],
                  history.history["val_loss"][-1],
                  history.history["Recall"][-1],
                  history.history["val_Recall"][-1],
                  round(end - start, 2)]
results.head(1)
|   | # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.05994 | 0.861486 | 0.851351 | 6.92 |
Observations¶
Classification Report - Train data Model_0
High precision (0.99) for failures (class 1): Most of the predicted failures were correct, meaning the model rarely mislabels healthy generators as failed.
Moderate recall (0.87) for failures: 13% of actual failures were missed by the model, leaving room for improvement in catching all failure cases.
Overall accuracy is 99%, but due to class imbalance (many more non-failures than failures), F1-score for class 1 is a better indicator.
Macro average recall (0.93) reflects imbalance — model is slightly less sensitive to the minority class (failures).
Classification Report - Validation data Model_0
Slight drop in recall on validation (0.85) for failures — the model misses more failure cases compared to training (potential mild overfitting).
Precision remains strong (0.96) for failures: false alarms (false positives) are rare.
F1-score of 0.90 for failures is still very good, indicating balanced performance.
Note: The consistent performance across train and validation shows the model generalizes well, but it can benefit from techniques that boost recall, such as class weighting (the dataset is imbalanced), adding an additional hidden layer, changing the optimizer, or using dropout/regularization. We'll improve our model accordingly.
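Class weighting, mentioned in the note above, can be passed to Keras through the `class_weight` argument of `fit`. A sketch computing balanced weights with scikit-learn, using toy labels with roughly this dataset's imbalance (~5.6% failures):

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels: 944 healthy (0) and 56 failed (1) generators
y_toy = np.array([0] * 944 + [1] * 56)

# "balanced" weights each class by n_samples / (n_classes * class_count)
weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1]), y=y_toy)
class_weight = {0: weights[0], 1: weights[1]}
# Then: model.fit(x_train, y_train, class_weight=class_weight, ...)
```

The minority (failure) class gets a much larger weight, so missing a failure costs more in the loss, which pushes the model toward higher recall.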
Model Performance Improvement¶
Model 1¶
Let's add another hidden layer (ReLU activation, 64 units), keeping the same SGD optimizer, to see how much the model improves with the extra layer.
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_1 = Sequential()
# Adding a dense layer with ReLU activation; the input dimension is set from the training data shape.
model_1.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_1.add(Dense(64, activation = 'relu')) # second hidden layer with 64 units
model_1.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 128)            │         5,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 13,569 (53.00 KB)
Trainable params: 13,569 (53.00 KB)
Non-trainable params: 0 (0.00 B)
# SGD optimizer
optimizer = keras.optimizers.SGD() # defining SGD as the optimizer to be used
model_1.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_1.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 404us/step - Recall: 0.3526 - loss: 0.1638 - val_Recall: 0.6622 - val_loss: 0.0894
Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 290us/step - Recall: 0.6919 - loss: 0.0810 - val_Recall: 0.7477 - val_loss: 0.0772
... (epochs 3-48 omitted: loss decreases steadily; val_loss falls from 0.071 to ~0.055 and val_Recall climbs from 0.77 to ~0.86) ...
Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 287us/step - Recall: 0.8933 - loss: 0.0287 - val_Recall: 0.8559 - val_loss: 0.0563
Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 285us/step - Recall: 0.8980 - loss: 0.0263 - val_Recall: 0.8694 - val_loss: 0.0552
print("Time taken in seconds ",end-start)
Time taken in seconds 7.699712038040161
plot(history,'loss')
plot(history,'Recall')
model_1_train_perf = model_performance_classification(model_1, x_train, y_train) #threshold = 0.5
print("Model 1 Train Performance")
print(model_1_train_perf)
print("-"*50)
print("\nModel 1 Validation Performance")
model_1_val_perf = model_performance_classification(model_1,x_val,y_val)
print(model_1_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 136us/step
Model 1 Train Performance
   Accuracy    Recall  Precision  F1 Score
0  0.994437  0.953597   0.992995   0.97237
--------------------------------------------------
Model 1 Validation Performance
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 159us/step
   Accuracy    Recall  Precision  F1 Score
0   0.99175  0.934155   0.986035  0.958444
Adding one hidden layer increased the recall, showing that the model has improved.
#let's get classification report on y train and y val for Model 1
y_train_pred_1 = model_1.predict(x_train)
y_val_pred_1 = model_1.predict(x_val)
print("Classification Report - Train data Model_1",end="\n\n")
cr_train_model_1 = classification_report(y_train,y_train_pred_1>0.5)
print(cr_train_model_1)
print("Classification Report - Validation data Model_1",end="\n\n")
cr_val_model_1 = classification_report(y_val,y_val_pred_1>0.5)
print(cr_val_model_1)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 147us/step
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 171us/step
Classification Report - Train data Model_1

              precision    recall  f1-score   support

           0       0.99      1.00      1.00     15112
           1       0.99      0.91      0.95       888

    accuracy                           0.99     16000
   macro avg       0.99      0.95      0.97     16000
weighted avg       0.99      0.99      0.99     16000

Classification Report - Validation data Model_1

              precision    recall  f1-score   support

           0       0.99      1.00      1.00      3778
           1       0.98      0.87      0.92       222

    accuracy                           0.99      4000
   macro avg       0.99      0.93      0.96      4000
weighted avg       0.99      0.99      0.99      4000
# Predict probabilities on the validation data and convert them to class labels at a 0.5 threshold
y_pred = (model_1.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 170us/step
results.loc[1] = [2, [128, 64], ["relu", "relu"], 50, 32, "sgd",
                  [0.01, "-"], "-", "-",
                  history.history["loss"][-1],
                  history.history["val_loss"][-1],
                  history.history["Recall"][-1],
                  history.history["val_Recall"][-1],
                  round(end - start, 2)]
results
|   | # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
Observations¶
- After adding another hidden layer, validation recall has slightly increased (from 0.85 to 0.87).
- Precision remains high, at 99% and 98% for predicting generator failure on the training and validation data respectively.
- The model is still overfitting: train recall is 0.902 versus a validation recall of 0.869, and the train-validation gap has widened, so let's try regularization (dropout neurons) in the next model.
Model 2¶
To keep the same level of precision and to reduce the gap between train and validation recall, we can use regularization techniques. One common technique is dropout.
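Dropout randomly zeroes a fraction of activations during training; Keras uses inverted dropout, scaling the surviving activations by 1/(1 - rate) so the expected activation is unchanged at inference time. A NumPy sketch of the idea:

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero out `rate` of the activations and rescale survivors by 1/(1 - rate)."""
    mask = rng.random(activations.shape) >= rate   # keep with probability 1 - rate
    return activations * mask / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones(1000)
out = inverted_dropout(a, rate=0.5, rng=rng)
# Survivors are scaled to 2.0, dropped units are 0; the mean stays near 1
```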
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_2 = Sequential()
# Adding a dense layer with ReLU activation; the input dimension is set from the training data shape.
model_2.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_2.add(Dropout(0.5)) # Adding 50% dropout after the first hidden layer
model_2.add(Dense(64, activation = 'relu')) # Adding a second dense layer with ReLU activation
model_2.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
model_2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 128)            │         5,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 13,569 (53.00 KB)
Trainable params: 13,569 (53.00 KB)
Non-trainable params: 0 (0.00 B)
# SGD optimizer
optimizer = keras.optimizers.SGD() # defining SGD as the optimizer to be used
model_2.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_2.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 415us/step - Recall: 0.2015 - loss: 0.2092 - val_Recall: 0.5315 - val_loss: 0.1075
Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 305us/step - Recall: 0.4688 - loss: 0.1357 - val_Recall: 0.6757 - val_loss: 0.0923
... (epochs 3-48 omitted: loss decreases steadily; val_loss falls from 0.084 to ~0.056 and val_Recall climbs from 0.73 to ~0.86) ...
Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 304us/step - Recall: 0.8458 - loss: 0.0500 - val_Recall: 0.8604 - val_loss: 0.0557
Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 305us/step - Recall: 0.8193 - loss: 0.0535 - val_Recall: 0.8559 - val_loss: 0.0561
print("Time taken in seconds ",end-start)
Time taken in seconds 8.07505202293396
plot(history,'loss')
plot(history,'Recall')
- The validation recall curve is consistently above the training recall curve, which suggests that dropout helped reduce overfitting.
- However, a 50% dropout might be too aggressive; we will try a lower dropout rate to strike a better balance between performance and generalization.
model_2_train_perf = model_performance_classification(model_2, x_train, y_train) #threshold = 0.5
print("Model 2 Train Performance")
print(model_2_train_perf)
print("-"*50)
print("Model 2 Validation Performance")
model_2_val_perf = model_performance_classification(model_2,x_val,y_val)
print(model_2_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 138us/step Model 2 Train Performance Accuracy Recall Precision F1 Score 0 0.992938 0.941144 0.990906 0.964526 -------------------------------------------------- Model 2 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 162us/step Accuracy Recall Precision F1 Score 0 0.99025 0.927002 0.978026 0.950888
#let's get classification report on y train and y val for Model 2
y_train_pred_2 = model_2.predict(x_train)
y_val_pred_2 = model_2.predict(x_val)
print("Classification Report - Train data Model_2",end="\n\n")
cr_train_model_2 = classification_report(y_train,y_train_pred_2>0.5)
print(cr_train_model_2)
print("Classification Report - Validation data Model_2",end="\n\n")
cr_val_model_2 = classification_report(y_val,y_val_pred_2>0.5)
print(cr_val_model_2)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 147us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 169us/step Classification Report - Train data Model_2 precision recall f1-score support 0 0.99 1.00 1.00 15112 1 0.99 0.88 0.93 888 accuracy 0.99 16000 macro avg 0.99 0.94 0.96 16000 weighted avg 0.99 0.99 0.99 16000 Classification Report - Validation data Model_2 precision recall f1-score support 0 0.99 1.00 0.99 3778 1 0.96 0.86 0.91 222 accuracy 0.99 4000 macro avg 0.98 0.93 0.95 4000 weighted avg 0.99 0.99 0.99 4000
# Predict class labels on the validation set (0.5 threshold) and plot the confusion matrix
y_pred = (model_2.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 175us/step
results.loc[2] = [2,[128,64],["relu","relu"],50,32,"sgd",[0.01, "-"],"-","dropout(0.5)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
Observations¶
- After applying 50% dropout, the validation recall is now higher than the train recall, which confirms that dropout helped reduce overfitting.
- However, validation recall dropped to ~85%, compared to ~87% for the previous model. We can try adding another hidden layer and decreasing the dropout rate to 30%.
- Another option is to insert Batch Normalization before or after the activation function to stabilize training and potentially increase recall.
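As a hedged sketch of the Batch Normalization idea mentioned above (this is illustrative only, not the configuration trained later in this notebook), the layer would typically sit between a Dense layer and its Dropout. The input dimension of 40 matches the number of predictors in this dataset:

```python
from tensorflow import keras
from tensorflow.keras.layers import Input, Dense, BatchNormalization, Dropout

# Illustrative architecture: Dense -> BatchNormalization -> Dropout.
# BatchNormalization standardizes each layer's outputs per mini-batch,
# which tends to stabilize training.
model_bn = keras.Sequential([
    Input(shape=(40,)),            # the dataset has 40 ciphered predictors
    Dense(128, activation="relu"),
    BatchNormalization(),
    Dropout(0.3),
    Dense(64, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model_bn.compile(loss="binary_crossentropy", optimizer="sgd", metrics=["Recall"])
```

Note that BatchNormalization adds a small number of extra parameters (4 per normalized unit: scale, shift, and the two running statistics).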
Model 3¶
Updating dropout to 30% and adding another hidden layer
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_3 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_3.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_3.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_3.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_3.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_3.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,729 (69.25 KB)
Trainable params: 17,729 (69.25 KB)
Non-trainable params: 0 (0.00 B)
# SGD optimizer
optimizer = keras.optimizers.SGD() # defining SGD as the optimizer to be used
model_3.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_3.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 435us/step - Recall: 0.1136 - loss: 0.1984 - val_Recall: 0.6081 - val_loss: 0.1010 Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 326us/step - Recall: 0.5237 - loss: 0.1175 - val_Recall: 0.7072 - val_loss: 0.0837 Epoch 3/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.6236 - loss: 0.0951 - val_Recall: 0.7793 - val_loss: 0.0762 Epoch 4/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.6721 - loss: 0.0903 - val_Recall: 0.7658 - val_loss: 0.0713 Epoch 5/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.7158 - loss: 0.0804 - val_Recall: 0.8108 - val_loss: 0.0673 Epoch 6/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.7613 - loss: 0.0781 - val_Recall: 0.8063 - val_loss: 0.0648 Epoch 7/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.7355 - loss: 0.0670 - val_Recall: 0.8063 - val_loss: 0.0624 Epoch 8/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.7814 - loss: 0.0717 - val_Recall: 0.8243 - val_loss: 0.0601 Epoch 9/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.7632 - loss: 0.0678 - val_Recall: 0.8063 - val_loss: 0.0613 Epoch 10/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8116 - loss: 0.0616 - val_Recall: 0.8198 - val_loss: 0.0592 Epoch 11/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.7724 - loss: 0.0706 - val_Recall: 0.8423 - val_loss: 0.0572 Epoch 12/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8221 - loss: 0.0584 - val_Recall: 0.8288 - val_loss: 0.0560 Epoch 13/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.8261 - loss: 0.0527 - val_Recall: 0.8288 - val_loss: 0.0559 Epoch 14/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.7889 - loss: 0.0593 - val_Recall: 0.8423 - val_loss: 0.0552 Epoch 15/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8107 - loss: 0.0586 - val_Recall: 0.8333 - val_loss: 0.0569 Epoch 16/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.8211 - loss: 0.0512 - val_Recall: 
0.8423 - val_loss: 0.0546 Epoch 17/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.8335 - loss: 0.0505 - val_Recall: 0.8333 - val_loss: 0.0543 Epoch 18/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.8234 - loss: 0.0527 - val_Recall: 0.8378 - val_loss: 0.0541 Epoch 19/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8519 - loss: 0.0480 - val_Recall: 0.8378 - val_loss: 0.0553 Epoch 20/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.8390 - loss: 0.0461 - val_Recall: 0.8423 - val_loss: 0.0532 Epoch 21/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8504 - loss: 0.0483 - val_Recall: 0.8468 - val_loss: 0.0530 Epoch 22/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.8430 - loss: 0.0506 - val_Recall: 0.8378 - val_loss: 0.0550 Epoch 23/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8305 - loss: 0.0530 - val_Recall: 0.8423 - val_loss: 0.0534 Epoch 24/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8328 - loss: 0.0497 - val_Recall: 0.8468 - val_loss: 0.0540 Epoch 25/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8599 - loss: 0.0416 - val_Recall: 0.8423 - val_loss: 0.0533 Epoch 26/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 327us/step - Recall: 0.8539 - loss: 0.0427 - val_Recall: 0.8468 - val_loss: 0.0526 Epoch 27/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8475 - loss: 0.0454 - val_Recall: 0.8423 - val_loss: 0.0535 Epoch 28/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.8439 - loss: 0.0423 - val_Recall: 0.8468 - val_loss: 0.0531 Epoch 29/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8474 - loss: 0.0506 - val_Recall: 0.8514 - val_loss: 0.0530 Epoch 30/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8615 - loss: 0.0441 - val_Recall: 0.8514 - val_loss: 0.0526 Epoch 31/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8546 - loss: 0.0460 - val_Recall: 0.8423 - val_loss: 0.0533 Epoch 32/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 
0.8564 - loss: 0.0473 - val_Recall: 0.8468 - val_loss: 0.0547 Epoch 33/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.8672 - loss: 0.0455 - val_Recall: 0.8468 - val_loss: 0.0532 Epoch 34/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8649 - loss: 0.0430 - val_Recall: 0.8514 - val_loss: 0.0526 Epoch 35/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8842 - loss: 0.0382 - val_Recall: 0.8604 - val_loss: 0.0522 Epoch 36/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8665 - loss: 0.0454 - val_Recall: 0.8604 - val_loss: 0.0523 Epoch 37/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.8820 - loss: 0.0380 - val_Recall: 0.8604 - val_loss: 0.0528 Epoch 38/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 325us/step - Recall: 0.8672 - loss: 0.0425 - val_Recall: 0.8649 - val_loss: 0.0522 Epoch 39/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8836 - loss: 0.0377 - val_Recall: 0.8604 - val_loss: 0.0523 Epoch 40/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8790 - loss: 0.0404 - val_Recall: 0.8694 - val_loss: 0.0523 Epoch 41/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8648 - loss: 0.0446 - val_Recall: 0.8739 - val_loss: 0.0528 Epoch 42/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8733 - loss: 0.0417 - val_Recall: 0.8739 - val_loss: 0.0520 Epoch 43/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.8861 - loss: 0.0365 - val_Recall: 0.8649 - val_loss: 0.0525 Epoch 44/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8588 - loss: 0.0413 - val_Recall: 0.8739 - val_loss: 0.0531 Epoch 45/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8818 - loss: 0.0387 - val_Recall: 0.8649 - val_loss: 0.0534 Epoch 46/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8714 - loss: 0.0406 - val_Recall: 0.8649 - val_loss: 0.0532 Epoch 47/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8665 - loss: 0.0455 - val_Recall: 0.8649 - val_loss: 0.0536 Epoch 48/50 500/500 
━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.8700 - loss: 0.0408 - val_Recall: 0.8649 - val_loss: 0.0552 Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 371us/step - Recall: 0.8816 - loss: 0.0366 - val_Recall: 0.8649 - val_loss: 0.0550 Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 324us/step - Recall: 0.8851 - loss: 0.0349 - val_Recall: 0.8694 - val_loss: 0.0537
print("Time taken in seconds ",end-start)
Time taken in seconds 8.56450891494751
plot(history,'loss')
plot(history,'Recall')
A 30% dropout with one additional hidden layer gives a better result than the previous model. Train and validation recall converge early, at around 10 epochs.
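Since train and validation recall converge after roughly 10 epochs, an `EarlyStopping` callback could cut training time without hurting recall. A minimal sketch (the monitor and patience values here are illustrative choices, not tuned ones):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 10 consecutive epochs,
# and roll back to the best weights seen so far.
early_stop = EarlyStopping(
    monitor="val_loss",
    patience=10,
    restore_best_weights=True,
)

# Usage (with the variables defined in the cells above):
# history = model_3.fit(x_train, y_train, validation_data=(x_val, y_val),
#                       batch_size=batch_size, epochs=epochs,
#                       callbacks=[early_stop])
```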
model_3_train_perf = model_performance_classification(model_3, x_train, y_train) #threshold = 0.5
print("Model 3 Train Performance")
print(model_3_train_perf)
print("-"*50)
print("Model 3 Validation Performance")
model_3_val_perf = model_performance_classification(model_3,x_val,y_val)
print(model_3_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 144us/step Model 3 Train Performance Accuracy Recall Precision F1 Score 0 0.99375 0.948463 0.991422 0.968834 -------------------------------------------------- Model 3 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 166us/step Accuracy Recall Precision F1 Score 0 0.9915 0.934023 0.98356 0.957281
#let's get classification report on y train and y val for Model 3
y_train_pred_3 = model_3.predict(x_train)
y_val_pred_3 = model_3.predict(x_val)
print("Classification Report - Train data Model_3",end="\n\n")
cr_train_model_3 = classification_report(y_train,y_train_pred_3>0.5)
print(cr_train_model_3)
print("Classification Report - Validation data Model_3",end="\n\n")
cr_val_model_3 = classification_report(y_val,y_val_pred_3>0.5)
print(cr_val_model_3)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 151us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 173us/step Classification Report - Train data Model_3 precision recall f1-score support 0 0.99 1.00 1.00 15112 1 0.99 0.90 0.94 888 accuracy 0.99 16000 macro avg 0.99 0.95 0.97 16000 weighted avg 0.99 0.99 0.99 16000 Classification Report - Validation data Model_3 precision recall f1-score support 0 0.99 1.00 1.00 3778 1 0.97 0.87 0.92 222 accuracy 0.99 4000 macro avg 0.98 0.93 0.96 4000 weighted avg 0.99 0.99 0.99 4000
# Predict class labels on the validation set (0.5 threshold) and plot the confusion matrix
y_pred = (model_3.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 176us/step
results.loc[3] = [3,[128,64,64],["relu","relu","relu"],50,32,"sgd",[0.01, "-"],"-","dropout(0.3)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
Observations¶
- Decreasing the dropout rate from 50% to 30% minimized the gap between train and validation recall.
- Both train and validation recall increased compared to the 50% dropout model.
- We will keep the 30% dropout for now. Since the class distribution is imbalanced, we will also use class weights so that the model gives proportionally more importance to the minority class (failures).
Model 4¶
# Calculate class weights for the imbalanced dataset: n_samples / count(class)
cw = y_train.shape[0] / np.bincount(y_train.astype(int))  # convert y_train to integers for bincount
# Map each class index to its weight
cw_dict = {i: w for i, w in enumerate(cw)}
cw_dict
{0: 1.0587612493382743, 1: 18.01801801801802}
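The weights above follow the `n_samples / count(class)` convention. A common alternative (used by scikit-learn's `compute_class_weight('balanced', ...)`) also divides by the number of classes; both preserve the same ratio between the classes, which is what matters for the loss. A small self-contained check using the class counts from this training split (15112 negatives, 888 positives):

```python
import numpy as np

counts = np.array([15112, 888])  # class counts in the training split above
n = counts.sum()                 # 16000 samples

cw = n / counts                            # convention used in this notebook
cw_balanced = n / (len(counts) * counts)   # sklearn 'balanced' convention

print(dict(enumerate(cw)))           # {0: 1.0587..., 1: 18.0180...}
print(dict(enumerate(cw_balanced)))  # each weight is exactly half the above
```

Either convention tells the optimizer that a misclassified failure costs roughly 17x more than a misclassified non-failure.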
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_4 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_4.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_4.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_4.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_4.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_4.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,729 (69.25 KB)
Trainable params: 17,729 (69.25 KB)
Non-trainable params: 0 (0.00 B)
# SGD optimizer
optimizer = keras.optimizers.SGD() # defining SGD as the optimizer to be used
model_4.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_4.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 439us/step - Recall: 0.8212 - loss: 0.8307 - val_Recall: 0.8874 - val_loss: 0.2598 Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.8902 - loss: 0.5208 - val_Recall: 0.8874 - val_loss: 0.2392 Epoch 3/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9039 - loss: 0.4192 - val_Recall: 0.8829 - val_loss: 0.2027 Epoch 4/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9008 - loss: 0.4242 - val_Recall: 0.8874 - val_loss: 0.1191 Epoch 5/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9017 - loss: 0.4073 - val_Recall: 0.9009 - val_loss: 0.1487 Epoch 6/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9115 - loss: 0.3735 - val_Recall: 0.8964 - val_loss: 0.1968 Epoch 7/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.8996 - loss: 0.3732 - val_Recall: 0.8874 - val_loss: 0.1157 Epoch 8/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9065 - loss: 0.3482 - val_Recall: 0.8919 - val_loss: 0.1879 Epoch 9/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.9118 - loss: 0.3780 - val_Recall: 0.9009 - val_loss: 0.1402 Epoch 10/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9205 - loss: 0.3428 - val_Recall: 0.8919 - val_loss: 0.1305 Epoch 11/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.8920 - loss: 0.3725 - val_Recall: 0.8919 - val_loss: 0.1062 Epoch 12/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 317us/step - Recall: 0.9045 - loss: 0.3278 - val_Recall: 0.8919 - val_loss: 0.1247 Epoch 13/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9121 - loss: 0.3249 - val_Recall: 0.8919 - val_loss: 0.1439 Epoch 14/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 323us/step - Recall: 0.9066 - loss: 0.3283 - val_Recall: 0.8874 - val_loss: 0.1218 Epoch 15/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9074 - loss: 0.3492 - val_Recall: 0.8964 - val_loss: 0.1152 Epoch 16/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9108 - loss: 0.3124 - val_Recall: 
0.8919 - val_loss: 0.1065 Epoch 17/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9009 - loss: 0.3493 - val_Recall: 0.8919 - val_loss: 0.1138 Epoch 18/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9171 - loss: 0.3037 - val_Recall: 0.9009 - val_loss: 0.1623 Epoch 19/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.9143 - loss: 0.3139 - val_Recall: 0.8964 - val_loss: 0.1379 Epoch 20/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9253 - loss: 0.2829 - val_Recall: 0.8919 - val_loss: 0.1088 Epoch 21/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.9148 - loss: 0.3155 - val_Recall: 0.9054 - val_loss: 0.1530 Epoch 22/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.9166 - loss: 0.2788 - val_Recall: 0.8964 - val_loss: 0.1282 Epoch 23/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9054 - loss: 0.3134 - val_Recall: 0.8874 - val_loss: 0.1089 Epoch 24/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9192 - loss: 0.2828 - val_Recall: 0.8919 - val_loss: 0.1232 Epoch 25/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.9052 - loss: 0.3074 - val_Recall: 0.8919 - val_loss: 0.1094 Epoch 26/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9252 - loss: 0.2696 - val_Recall: 0.8919 - val_loss: 0.1328 Epoch 27/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9204 - loss: 0.2734 - val_Recall: 0.9009 - val_loss: 0.1442 Epoch 28/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9229 - loss: 0.2718 - val_Recall: 0.8829 - val_loss: 0.1051 Epoch 29/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9238 - loss: 0.2598 - val_Recall: 0.8964 - val_loss: 0.1007 Epoch 30/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9232 - loss: 0.2869 - val_Recall: 0.9009 - val_loss: 0.1579 Epoch 31/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9083 - loss: 0.2964 - val_Recall: 0.8919 - val_loss: 0.1282 Epoch 32/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 
0.9205 - loss: 0.2689 - val_Recall: 0.8874 - val_loss: 0.1028 Epoch 33/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9101 - loss: 0.2843 - val_Recall: 0.9009 - val_loss: 0.1693 Epoch 34/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 322us/step - Recall: 0.9151 - loss: 0.3001 - val_Recall: 0.8874 - val_loss: 0.1153 Epoch 35/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9300 - loss: 0.2463 - val_Recall: 0.8919 - val_loss: 0.1109 Epoch 36/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9223 - loss: 0.2401 - val_Recall: 0.8919 - val_loss: 0.1203 Epoch 37/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9279 - loss: 0.2393 - val_Recall: 0.8964 - val_loss: 0.1216 Epoch 38/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9164 - loss: 0.2636 - val_Recall: 0.8964 - val_loss: 0.1389 Epoch 39/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9225 - loss: 0.2582 - val_Recall: 0.8919 - val_loss: 0.1043 Epoch 40/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9197 - loss: 0.2494 - val_Recall: 0.8919 - val_loss: 0.1003 Epoch 41/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 321us/step - Recall: 0.9254 - loss: 0.2498 - val_Recall: 0.8964 - val_loss: 0.1129 Epoch 42/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9088 - loss: 0.2779 - val_Recall: 0.8919 - val_loss: 0.1320 Epoch 43/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9282 - loss: 0.2360 - val_Recall: 0.8919 - val_loss: 0.1024 Epoch 44/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 318us/step - Recall: 0.9184 - loss: 0.2546 - val_Recall: 0.8919 - val_loss: 0.0963 Epoch 45/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9142 - loss: 0.2614 - val_Recall: 0.8964 - val_loss: 0.1444 Epoch 46/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9372 - loss: 0.2167 - val_Recall: 0.8964 - val_loss: 0.1056 Epoch 47/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9283 - loss: 0.2475 - val_Recall: 0.8964 - val_loss: 0.1172 Epoch 48/50 500/500 
━━━━━━━━━━━━━━━━━━━━ 0s 319us/step - Recall: 0.9184 - loss: 0.2520 - val_Recall: 0.8964 - val_loss: 0.1421 Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9247 - loss: 0.2428 - val_Recall: 0.8964 - val_loss: 0.1270 Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step - Recall: 0.9397 - loss: 0.2074 - val_Recall: 0.8919 - val_loss: 0.1085
print("Time taken in seconds ",end-start)
Time taken in seconds 8.464919090270996
plot(history,'loss')
plot(history,'Recall')
model_4_train_perf = model_performance_classification(model_4, x_train, y_train) #threshold = 0.5
print("Model 4 Train Performance")
print(model_4_train_perf)
print("-"*50)
print("Model 4 Validation Performance")
model_4_val_perf = model_performance_classification(model_4,x_val,y_val)
print(model_4_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 147us/step Model 4 Train Performance Accuracy Recall Precision F1 Score 0 0.989812 0.961218 0.943856 0.952347 -------------------------------------------------- Model 4 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 162us/step Accuracy Recall Precision F1 Score 0 0.9875 0.942505 0.938786 0.940637
#let's get classification report on y train and y val for Model 4
y_train_pred_4 = model_4.predict(x_train)
y_val_pred_4 = model_4.predict(x_val)
print("Classification Report - Train data Model_4",end="\n\n")
cr_train_model_4 = classification_report(y_train,y_train_pred_4>0.5)
print(cr_train_model_4)
print("Classification Report - Validation data Model_4",end="\n\n")
cr_val_model_4 = classification_report(y_val,y_val_pred_4>0.5)
print(cr_val_model_4)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 151us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 176us/step Classification Report - Train data Model_4 precision recall f1-score support 0 1.00 0.99 0.99 15112 1 0.89 0.93 0.91 888 accuracy 0.99 16000 macro avg 0.94 0.96 0.95 16000 weighted avg 0.99 0.99 0.99 16000 Classification Report - Validation data Model_4 precision recall f1-score support 0 0.99 0.99 0.99 3778 1 0.88 0.89 0.89 222 accuracy 0.99 4000 macro avg 0.94 0.94 0.94 4000 weighted avg 0.99 0.99 0.99 4000
# Predict class labels on the validation set (0.5 threshold) and plot the confusion matrix
y_pred = (model_4.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 179us/step
results.loc[4] = [3,[128,64,64],["relu","relu","relu"],50,32,"sgd",[0.01, "-"],"Xavier - CW ","dropout(0.3)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
Observations¶
- Applying class weights to handle the imbalanced dataset improved recall.
- The false positive rate (Type I error) increased from 0.07% to 0.95%. This is the expected trade-off of weighting the minority class, and it is acceptable because false positives (unnecessary inspections) are far cheaper than missed failures.
- Both train and validation recall increased compared to the previous model.
- We also see significant oscillation in both train and validation recall, which suggests instability in the learning process.
- Let's introduce some momentum to smooth and accelerate the weight updates.
- We can then move to another optimizer, Adam, for smoother recall curves, faster convergence, and better handling of sparse gradients and class imbalance.
- We might also try Batch Normalization to stabilize and standardize activations, reducing internal covariate shift; this should dampen the oscillations and may improve recall.
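The two optimizer options discussed above can be defined as follows. The momentum value of 0.3 is the one tried in the next model; the Adam hyperparameters shown are simply the Keras defaults, written out explicitly for comparison:

```python
from tensorflow import keras

# SGD with momentum: past gradients accumulate into a velocity term,
# smoothing the update direction across mini-batches.
sgd_momentum = keras.optimizers.SGD(learning_rate=0.01, momentum=0.3)

# Adam: adaptive per-parameter learning rates derived from running
# estimates of the first and second moments of the gradients.
adam = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)
```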
Model 5¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_5 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_5.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_5.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_5.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_5.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_5.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_5.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,729 (69.25 KB)
Trainable params: 17,729 (69.25 KB)
Non-trainable params: 0 (0.00 B)
# SGD optimizer with momentum
mom = 0.3
optimizer = keras.optimizers.SGD(momentum=mom) # SGD with a momentum of 0.3 to smooth the updates
model_5.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_5.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 457us/step - Recall: 0.8293 - loss: 0.8449 - val_Recall: 0.8874 - val_loss: 0.2408
[epochs 2–49 omitted: train Recall climbed to ~0.93 while val_Recall oscillated between 0.8784 and 0.9009]
Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 335us/step - Recall: 0.9356 - loss: 0.2391 - val_Recall: 0.8874 - val_loss: 0.1120
print("Time taken in seconds ",end-start)
Time taken in seconds 8.886943817138672
plot(history,'loss')
plot(history,'Recall')
model_5_train_perf = model_performance_classification(model_5, x_train, y_train) #threshold = 0.5
print("Model 5 Train Performance")
print(model_5_train_perf)
print("-"*50)
print("Model 5 Validation Performance")
model_5_val_perf = model_performance_classification(model_5,x_val,y_val)
print(model_5_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 144us/step Model 5 Train Performance Accuracy Recall Precision F1 Score 0 0.9915 0.962642 0.956836 0.959718 -------------------------------------------------- Model 5 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 164us/step Accuracy Recall Precision F1 Score 0 0.98825 0.940782 0.946466 0.943603
#let's get classification report on y train and y val for Model 5
y_train_pred_5 = model_5.predict(x_train)
y_val_pred_5 = model_5.predict(x_val)
print("Classification Report - Train data Model_5",end="\n\n")
cr_train_model_5 = classification_report(y_train,y_train_pred_5>0.5)
print(cr_train_model_5)
print("Classification Report - Validation data Model_5",end="\n\n")
cr_val_model_5 = classification_report(y_val,y_val_pred_5>0.5)
print(cr_val_model_5)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 150us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 172us/step Classification Report - Train data Model_5 precision recall f1-score support 0 1.00 1.00 1.00 15112 1 0.92 0.93 0.92 888 accuracy 0.99 16000 macro avg 0.96 0.96 0.96 16000 weighted avg 0.99 0.99 0.99 16000 Classification Report - Validation data Model_5 precision recall f1-score support 0 0.99 0.99 0.99 3778 1 0.90 0.89 0.89 222 accuracy 0.99 4000 macro avg 0.95 0.94 0.94 4000 weighted avg 0.99 0.99 0.99 4000
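The class-1 rows in the reports above come straight from thresholding the predicted probabilities at 0.5. A minimal plain-Python sketch with made-up toy values (not the notebook's data) of how recall, precision, and F1 are derived:

```python
# Toy labels and predicted probabilities (illustrative values, not the notebook's data)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]

y_hat = [1 if p > 0.5 else 0 for p in y_prob]  # same 0.5 threshold used above

tp = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_hat) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 0)

recall = tp / (tp + fn)        # share of true failures the model catches
precision = tp / (tp + fp)     # share of flagged generators that truly fail
f1 = 2 * precision * recall / (precision + recall)
```

In this business context, recall on class 1 is the headline number: each false negative is a generator that fails in the field instead of being repaired in advance.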
# Predict on the validation set, apply the 0.5 threshold, and plot the confusion matrix
y_pred = (model_5.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val, y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm, group_names=labels, cmap="Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 174us/step
results.loc[5] = [3,[128,64,64],["relu","relu","relu"],50,32,"sgd",[0.01, 0.3],"Xavier - CW","dropout(0.3)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| | # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
Observations¶
- Applying momentum 0.3 does not yield better recall than the previous model.
- Train recall improves steadily and remains high (>0.92), but shows some fluctuation, especially in later epochs.
- Validation recall is more noisy and oscillates frequently around 0.88–0.89, indicating instability in learning generalizable patterns.
- Validation recall needs to be improved.
- SGD with momentum is very sensitive to the learning rate and often causes oscillation, so we will switch to the Adam optimizer, which adapts the learning rate for each parameter and is more stable.
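The contrast between the two optimizers can be sketched with their raw update rules on a toy 1-D quadratic (plain Python, unrelated to the notebook's network; hyperparameters chosen for illustration only):

```python
import math

def grad(w):                 # gradient of f(w) = w**2, minimum at w = 0
    return 2.0 * w

# SGD with momentum: a velocity term accumulates past gradients
w, v = 5.0, 0.0
lr, mu = 0.1, 0.3
for _ in range(100):
    v = mu * v - lr * grad(w)
    w = w + v
w_sgd = w

# Adam: step size adapted per parameter from running moment estimates
w, m, s = 5.0, 0.0, 0.0
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 101):
    g = grad(w)
    m = b1 * m + (1 - b1) * g          # first moment (mean of gradients)
    s = b2 * s + (1 - b2) * g * g      # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)          # bias correction for the zero init
    s_hat = s / (1 - b2 ** t)
    w = w - lr * m_hat / (math.sqrt(s_hat) + eps)
w_adam = w
```

With momentum, a learning rate that is too high for the curvature makes the velocity term overshoot repeatedly; Adam divides each step by the running gradient magnitude, which damps exactly that behaviour.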
Model 6¶
With Adam Optimizer
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_6 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_6.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_6.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_6.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_6.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_6.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_6.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,729 (69.25 KB)
Trainable params: 17,729 (69.25 KB)
Non-trainable params: 0 (0.00 B)
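The counts in the summary can be verified by hand: a Dense layer holds inputs × units weights plus units biases. A quick arithmetic check (assuming the 40 ciphered predictors as the input width):

```python
# Dense layer params = inputs * units (weights) + units (biases)
n_features = 40                       # ciphered predictors in this dataset
layers = [128, 64, 64, 1]             # three hidden layers + sigmoid output

total, fan_in = 0, n_features
for units in layers:
    params = fan_in * units + units   # 5248, 8256, 4160, 65 for this stack
    total += params
    fan_in = units                    # next layer's input width
```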
# Adam Optimizer
lr = 0.001
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_6.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_6.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 501us/step - Recall: 0.7646 - loss: 0.8452 - val_Recall: 0.8784 - val_loss: 0.1619
[epochs 2–49 omitted: train Recall climbed to ~0.93 while val_Recall oscillated between 0.8694 and 0.9099]
Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 371us/step - Recall: 0.9259 - loss: 0.2271 - val_Recall: 0.9009 - val_loss: 0.1431
print("Time taken in seconds ",end-start)
Time taken in seconds 9.925749063491821
plot(history,'loss')
plot(history,'Recall')
model_6_train_perf = model_performance_classification(model_6, x_train, y_train) #threshold = 0.5
print("Model 6 Train Performance")
print(model_6_train_perf)
print("-"*50)
print("Model 6 Validation Performance")
model_6_val_perf = model_performance_classification(model_6,x_val,y_val)
print(model_6_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 149us/step Model 6 Train Performance Accuracy Recall Precision F1 Score 0 0.987125 0.966685 0.919863 0.941851 -------------------------------------------------- Model 6 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 170us/step Accuracy Recall Precision F1 Score 0 0.9825 0.944098 0.900294 0.920884
#let's get classification report on y train and y val for Model 6
y_train_pred_6 = model_6.predict(x_train)
y_val_pred_6 = model_6.predict(x_val)
print("Classification Report - Train data Model_6",end="\n\n")
cr_train_model_6 = classification_report(y_train,y_train_pred_6>0.5)
print(cr_train_model_6)
print("Classification Report - Validation data Model_6",end="\n\n")
cr_val_model_6 = classification_report(y_val,y_val_pred_6>0.5)
print(cr_val_model_6)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 153us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 171us/step Classification Report - Train data Model_6 precision recall f1-score support 0 1.00 0.99 0.99 15112 1 0.84 0.94 0.89 888 accuracy 0.99 16000 macro avg 0.92 0.97 0.94 16000 weighted avg 0.99 0.99 0.99 16000 Classification Report - Validation data Model_6 precision recall f1-score support 0 0.99 0.99 0.99 3778 1 0.81 0.90 0.85 222 accuracy 0.98 4000 macro avg 0.90 0.94 0.92 4000 weighted avg 0.98 0.98 0.98 4000
# Predict on the validation set, apply the 0.5 threshold, and plot the confusion matrix
y_pred = (model_6.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val, y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm, group_names=labels, cmap="Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 169us/step
results.loc[6] = [3,[128,64,64],["relu","relu","relu"],50,32,"adam",[0.001, "-"],"Xavier - CW ","dropout(0.3)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| | # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
| 6 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.001, -] | Xavier - CW | dropout(0.3) | 0.234939 | 0.143119 | 0.930180 | 0.900901 | 9.93 |
Observations¶
- Switching to the Adam optimizer with lr = 0.001 slightly improved recall on both train and validation.
- Oscillation is still present, which suggests the model is still struggling to settle into a minimum.
- Adam adapts the learning rate per parameter but remains sensitive to its initial value, so let's try decreasing the learning rate.
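The link between learning rate and oscillation shows up even in plain gradient descent on a 1-D quadratic (an illustrative sketch, separate from the notebook's models): a step larger than the curvature allows makes the iterate overshoot the minimum and flip sign every update, while a smaller step contracts smoothly.

```python
def descend(lr, steps=20, w=5.0):
    """Plain gradient descent on f(w) = w**2 (gradient 2*w)."""
    path = [w]
    for _ in range(steps):
        w = w - lr * 2.0 * w
        path.append(w)
    return path

smooth = descend(lr=0.1)   # w shrinks monotonically: 5.0, 4.0, 3.2, ...
wobble = descend(lr=0.9)   # w overshoots and flips sign: 5.0, -4.0, 3.2, ...
```

Both runs reduce |w| each step here, but the large-rate run bounces across the minimum, which is the 1-D analogue of the noisy validation recall curves above.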
Model 7¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_7 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_7.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_7.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_7.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_7.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_7.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_7.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,729 (69.25 KB)
Trainable params: 17,729 (69.25 KB)
Non-trainable params: 0 (0.00 B)
# Adam Optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_7.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_7.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 498us/step - Recall: 0.3989 - loss: 1.6584 - val_Recall: 0.8784 - val_loss: 0.3903
[epochs 2–49 omitted: train Recall climbed to ~0.93 while val_Recall stayed nearly flat between 0.8784 and 0.8964]
Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 374us/step - Recall: 0.9165 - loss: 0.2655 - val_Recall: 0.8874 - val_loss: 0.1101
print("Time taken in seconds ",end-start)
Time taken in seconds 10.007179021835327
plot(history,'loss')
plot(history,'Recall')
model_7_train_perf = model_performance_classification(model_7, x_train, y_train) #threshold = 0.5
print("Model 7 Train Performance")
print(model_7_train_perf)
print("-"*50)
print("Model 7 Validation Performance")
model_7_val_perf = model_performance_classification(model_7,x_val,y_val)
print(model_7_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 145us/step Model 7 Train Performance Accuracy Recall Precision F1 Score 0 0.989313 0.958304 0.941949 0.949956 -------------------------------------------------- Model 7 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 167us/step Accuracy Recall Precision F1 Score 0 0.98725 0.940253 0.938395 0.939321
#let's get classification report on y train and y val for Model 7
y_train_pred_7 = model_7.predict(x_train)
y_val_pred_7 = model_7.predict(x_val)
print("Classification Report - Train data Model_7",end="\n\n")
cr_train_model_7 = classification_report(y_train,y_train_pred_7>0.5)
print(cr_train_model_7)
print("Classification Report - Validation data Model_7",end="\n\n")
cr_val_model_7 = classification_report(y_val,y_val_pred_7>0.5)
print(cr_val_model_7)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 148us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 174us/step Classification Report - Train data Model_7 precision recall f1-score support 0 1.00 0.99 0.99 15112 1 0.89 0.92 0.91 888 accuracy 0.99 16000 macro avg 0.94 0.96 0.95 16000 weighted avg 0.99 0.99 0.99 16000 Classification Report - Validation data Model_7 precision recall f1-score support 0 0.99 0.99 0.99 3778 1 0.88 0.89 0.89 222 accuracy 0.99 4000 macro avg 0.94 0.94 0.94 4000 weighted avg 0.99 0.99 0.99 4000
# Predict on the validation set, apply the 0.5 threshold, and plot the confusion matrix
y_pred = (model_7.predict(x_val) > 0.5).astype(int)
cm = confusion_matrix(y_val, y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm, group_names=labels, cmap="Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 168us/step
results.loc[7] = [3,[128,64,64],["relu","relu","relu"],50,32,"adam",[1e-04, "-"],"Xavier - CW ","dropout(0.3)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
| 6 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.001, -] | Xavier - CW | dropout(0.3) | 0.234939 | 0.143119 | 0.930180 | 0.900901 | 9.93 |
| 7 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.0001, -] | Xavier - CW | dropout(0.3) | 0.269328 | 0.110118 | 0.920045 | 0.887387 | 10.01 |
Observations¶
- No significant change in recall for either train or validation, and the oscillation still persists. Let's decrease the learning rate further and see if that reduces the oscillation.
Model 8¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_8 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_8.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_8.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_8.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_8.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_8.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_8.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 17,729 (69.25 KB)
Trainable params: 17,729 (69.25 KB)
Non-trainable params: 0 (0.00 B)
# Adam optimizer
lr = 1e-5
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_8.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_8.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 499us/step - Recall: 0.8998 - loss: 1.6624 - val_Recall: 0.7613 - val_loss: 0.6448 Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.7767 - loss: 1.2090 - val_Recall: 0.7523 - val_loss: 0.4872 Epoch 3/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.7361 - loss: 1.1025 - val_Recall: 0.7658 - val_loss: 0.4323 Epoch 4/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.7975 - loss: 0.9568 - val_Recall: 0.7793 - val_loss: 0.4050 Epoch 5/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8031 - loss: 0.9381 - val_Recall: 0.7928 - val_loss: 0.3812 Epoch 6/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.7670 - loss: 0.9312 - val_Recall: 0.8153 - val_loss: 0.3619 Epoch 7/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8061 - loss: 0.8524 - val_Recall: 0.8198 - val_loss: 0.3483 Epoch 8/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.8294 - loss: 0.8025 - val_Recall: 0.8243 - val_loss: 0.3344 Epoch 9/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8443 - loss: 0.8099 - val_Recall: 0.8423 - val_loss: 0.3213 Epoch 10/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8341 - loss: 0.7757 - val_Recall: 0.8468 - val_loss: 0.3100 Epoch 11/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8546 - loss: 0.8072 - val_Recall: 0.8468 - val_loss: 0.2928 Epoch 12/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8867 - loss: 0.6780 - val_Recall: 0.8649 - val_loss: 0.2864 Epoch 13/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 374us/step - Recall: 0.8616 - loss: 0.6897 - val_Recall: 0.8649 - val_loss: 0.2751 Epoch 14/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8588 - loss: 0.7310 - val_Recall: 0.8694 - val_loss: 0.2670 Epoch 15/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 374us/step - Recall: 0.8832 - loss: 0.6706 - val_Recall: 0.8784 - val_loss: 0.2614 Epoch 16/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 374us/step - Recall: 0.8702 - loss: 0.6993 - val_Recall: 
0.8829 - val_loss: 0.2510 Epoch 17/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8840 - loss: 0.6585 - val_Recall: 0.8829 - val_loss: 0.2490 Epoch 18/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.8713 - loss: 0.6453 - val_Recall: 0.8874 - val_loss: 0.2429 Epoch 19/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 381us/step - Recall: 0.8958 - loss: 0.6416 - val_Recall: 0.8829 - val_loss: 0.2392 Epoch 20/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8577 - loss: 0.6586 - val_Recall: 0.8829 - val_loss: 0.2319 Epoch 21/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8806 - loss: 0.5998 - val_Recall: 0.8829 - val_loss: 0.2281 Epoch 22/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8830 - loss: 0.5773 - val_Recall: 0.8829 - val_loss: 0.2254 Epoch 23/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8818 - loss: 0.6106 - val_Recall: 0.8829 - val_loss: 0.2200 Epoch 24/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.8698 - loss: 0.6071 - val_Recall: 0.8829 - val_loss: 0.2156 Epoch 25/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.8865 - loss: 0.5822 - val_Recall: 0.8829 - val_loss: 0.2102 Epoch 26/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8685 - loss: 0.5977 - val_Recall: 0.8829 - val_loss: 0.2074 Epoch 27/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 375us/step - Recall: 0.8977 - loss: 0.5267 - val_Recall: 0.8874 - val_loss: 0.2054 Epoch 28/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 428us/step - Recall: 0.8972 - loss: 0.5788 - val_Recall: 0.8874 - val_loss: 0.1997 Epoch 29/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 380us/step - Recall: 0.8907 - loss: 0.5531 - val_Recall: 0.8874 - val_loss: 0.1963 Epoch 30/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8763 - loss: 0.5732 - val_Recall: 0.8874 - val_loss: 0.1919 Epoch 31/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 375us/step - Recall: 0.8908 - loss: 0.5556 - val_Recall: 0.8874 - val_loss: 0.1914 Epoch 32/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 
0.9036 - loss: 0.5317 - val_Recall: 0.8874 - val_loss: 0.1910 Epoch 33/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.9013 - loss: 0.5161 - val_Recall: 0.8874 - val_loss: 0.1893 Epoch 34/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8837 - loss: 0.5157 - val_Recall: 0.8874 - val_loss: 0.1839 Epoch 35/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 375us/step - Recall: 0.8720 - loss: 0.5710 - val_Recall: 0.8874 - val_loss: 0.1829 Epoch 36/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8659 - loss: 0.5846 - val_Recall: 0.8919 - val_loss: 0.1782 Epoch 37/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 379us/step - Recall: 0.8847 - loss: 0.5319 - val_Recall: 0.8919 - val_loss: 0.1785 Epoch 38/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 380us/step - Recall: 0.8958 - loss: 0.4959 - val_Recall: 0.8919 - val_loss: 0.1768 Epoch 39/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 379us/step - Recall: 0.8928 - loss: 0.5311 - val_Recall: 0.8919 - val_loss: 0.1722 Epoch 40/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.9001 - loss: 0.4968 - val_Recall: 0.8919 - val_loss: 0.1718 Epoch 41/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8988 - loss: 0.5197 - val_Recall: 0.8919 - val_loss: 0.1702 Epoch 42/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8733 - loss: 0.5366 - val_Recall: 0.8919 - val_loss: 0.1702 Epoch 43/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 381us/step - Recall: 0.8837 - loss: 0.5061 - val_Recall: 0.8919 - val_loss: 0.1693 Epoch 44/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8732 - loss: 0.5262 - val_Recall: 0.8919 - val_loss: 0.1694 Epoch 45/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8954 - loss: 0.5390 - val_Recall: 0.8919 - val_loss: 0.1656 Epoch 46/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.8954 - loss: 0.4729 - val_Recall: 0.8919 - val_loss: 0.1668 Epoch 47/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 377us/step - Recall: 0.9042 - loss: 0.4597 - val_Recall: 0.8919 - val_loss: 0.1643 Epoch 48/50 500/500 
━━━━━━━━━━━━━━━━━━━━ 0s 378us/step - Recall: 0.9104 - loss: 0.4624 - val_Recall: 0.8919 - val_loss: 0.1625 Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 376us/step - Recall: 0.8871 - loss: 0.5189 - val_Recall: 0.8919 - val_loss: 0.1619 Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 375us/step - Recall: 0.8759 - loss: 0.5361 - val_Recall: 0.8919 - val_loss: 0.1611
print("Time taken in seconds ",end-start)
Time taken in seconds 10.07526683807373
plot(history,'loss')
plot(history,'Recall')
model_8_train_perf = model_performance_classification(model_8, x_train, y_train) #threshold = 0.5
print("Model 8 Train Performance")
print(model_8_train_perf)
print("-"*50)
print("Model 8 Validation Performance")
model_8_val_perf = model_performance_classification(model_8,x_val,y_val)
print(model_8_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 144us/step Model 8 Train Performance Accuracy Recall Precision F1 Score 0 0.971625 0.938341 0.840087 0.881904 -------------------------------------------------- Model 8 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 166us/step Accuracy Recall Precision F1 Score 0 0.97 0.933241 0.833497 0.875704
# Let's get the classification report on y_train and y_val for Model 8
y_train_pred_8 = model_8.predict(x_train)
y_val_pred_8 = model_8.predict(x_val)
print("Classification Report - Train data Model_8",end="\n\n")
cr_train_model_8 = classification_report(y_train,y_train_pred_8>0.5)
print(cr_train_model_8)
print("Classification Report - Validation data Model_8",end="\n\n")
cr_val_model_8 = classification_report(y_val,y_val_pred_8>0.5)
print(cr_val_model_8)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 146us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 167us/step Classification Report - Train data Model_8 precision recall f1-score support 0 0.99 0.98 0.98 15112 1 0.69 0.90 0.78 888 accuracy 0.97 16000 macro avg 0.84 0.94 0.88 16000 weighted avg 0.98 0.97 0.97 16000 Classification Report - Validation data Model_8 precision recall f1-score support 0 0.99 0.97 0.98 3778 1 0.67 0.89 0.77 222 accuracy 0.97 4000 macro avg 0.83 0.93 0.88 4000 weighted avg 0.98 0.97 0.97 4000
# Predict on the validation data, threshold the probabilities at 0.5, and plot the confusion matrix
y_pred = model_8.predict(x_val)
for i in range(len(y_val)):
if y_pred[i]>0.5:
y_pred[i]=1
else:
y_pred[i]=0
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 177us/step
results.loc[8] = [3,[128,64,64],["relu","relu","relu"],50,32,"adam",[1e-5, "-"],"Xavier - CW ","dropout(0.3)",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
| 6 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.001, -] | Xavier - CW | dropout(0.3) | 0.234939 | 0.143119 | 0.930180 | 0.900901 | 9.93 |
| 7 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.0001, -] | Xavier - CW | dropout(0.3) | 0.269328 | 0.110118 | 0.920045 | 0.887387 | 10.01 |
| 8 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) | 0.486973 | 0.161076 | 0.888514 | 0.891892 | 10.08 |
Observations¶
- Switching to the Adam optimizer with lr = 1e-5 considerably reduced the recall oscillation on the validation set.
- The recall for both train and validation has dropped, and the model is still slightly overfit.
- As per the recall vs. epoch graph, validation recall converges with training recall at ~30 epochs and stays flat at ~0.87-0.88.
- Let's try Batch Normalization and see whether it helps stabilize the neural network for faster convergence.
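Since validation recall flattens well before epoch 50, an `EarlyStopping` callback is one way to cut training short once the validation loss stops improving. A minimal sketch (the patience value here is an assumption, not something tuned in this notebook):

```python
from tensorflow import keras

# Stop once val_loss has not improved for 5 consecutive epochs,
# restoring the best weights observed during training.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,  # assumed value; tune as needed
    restore_best_weights=True,
)

# Usage sketch:
# history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
#                     epochs=epochs, batch_size=batch_size,
#                     class_weight=cw_dict, callbacks=[early_stop])
```

With `restore_best_weights=True`, the model keeps the weights from the epoch with the lowest validation loss rather than the last epoch trained.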
Model 9¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_9 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_9.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1]))
model_9.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_9.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_9.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_9.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_9.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_9.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_9.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_9.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization │ (None, 128) │ 512 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_1 │ (None, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_2 │ (None, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 18,753 (73.25 KB)
Trainable params: 18,241 (71.25 KB)
Non-trainable params: 512 (2.00 KB)
# Adam optimizer
lr = 1e-5
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_9.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_9.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 623us/step - Recall: 0.3705 - loss: 2.4488 - val_Recall: 0.4775 - val_loss: 0.8437 Epoch 2/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 452us/step - Recall: 0.6144 - loss: 1.7520 - val_Recall: 0.7117 - val_loss: 0.7759 Epoch 3/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 455us/step - Recall: 0.6967 - loss: 1.4798 - val_Recall: 0.8288 - val_loss: 0.7253 Epoch 4/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 452us/step - Recall: 0.8097 - loss: 1.2481 - val_Recall: 0.8739 - val_loss: 0.6968 Epoch 5/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8398 - loss: 1.1383 - val_Recall: 0.8829 - val_loss: 0.6541 Epoch 6/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8642 - loss: 1.0938 - val_Recall: 0.8964 - val_loss: 0.6321 Epoch 7/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 451us/step - Recall: 0.8417 - loss: 1.0768 - val_Recall: 0.9144 - val_loss: 0.6081 Epoch 8/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8956 - loss: 1.0049 - val_Recall: 0.9189 - val_loss: 0.5836 Epoch 9/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 460us/step - Recall: 0.8832 - loss: 0.9862 - val_Recall: 0.9234 - val_loss: 0.5477 Epoch 10/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 459us/step - Recall: 0.8958 - loss: 0.9349 - val_Recall: 0.9189 - val_loss: 0.5392 Epoch 11/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 455us/step - Recall: 0.8961 - loss: 0.9025 - val_Recall: 0.9144 - val_loss: 0.5132 Epoch 12/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 458us/step - Recall: 0.9138 - loss: 0.8744 - val_Recall: 0.9144 - val_loss: 0.4941 Epoch 13/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 458us/step - Recall: 0.9064 - loss: 0.8623 - val_Recall: 0.9144 - val_loss: 0.4766 Epoch 14/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 458us/step - Recall: 0.8985 - loss: 0.8444 - val_Recall: 0.9099 - val_loss: 0.4534 Epoch 15/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.9213 - loss: 0.7858 - val_Recall: 0.9099 - val_loss: 0.4383 Epoch 16/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 461us/step - Recall: 0.9125 - loss: 0.7734 - val_Recall: 
0.9144 - val_loss: 0.4266 Epoch 17/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 457us/step - Recall: 0.8866 - loss: 0.7988 - val_Recall: 0.9144 - val_loss: 0.4069 Epoch 18/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8878 - loss: 0.7764 - val_Recall: 0.9099 - val_loss: 0.4044 Epoch 19/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 456us/step - Recall: 0.8927 - loss: 0.7722 - val_Recall: 0.9144 - val_loss: 0.3876 Epoch 20/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8987 - loss: 0.7379 - val_Recall: 0.9009 - val_loss: 0.3718 Epoch 21/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.9045 - loss: 0.7177 - val_Recall: 0.9009 - val_loss: 0.3638 Epoch 22/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8921 - loss: 0.7127 - val_Recall: 0.8964 - val_loss: 0.3453 Epoch 23/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 456us/step - Recall: 0.8708 - loss: 0.7407 - val_Recall: 0.9009 - val_loss: 0.3495 Epoch 24/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8928 - loss: 0.6848 - val_Recall: 0.8964 - val_loss: 0.3318 Epoch 25/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8832 - loss: 0.6903 - val_Recall: 0.8919 - val_loss: 0.3217 Epoch 26/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8768 - loss: 0.7081 - val_Recall: 0.8964 - val_loss: 0.3138 Epoch 27/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8758 - loss: 0.7093 - val_Recall: 0.8964 - val_loss: 0.3057 Epoch 28/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8977 - loss: 0.6438 - val_Recall: 0.8964 - val_loss: 0.3012 Epoch 29/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 456us/step - Recall: 0.9095 - loss: 0.6377 - val_Recall: 0.8964 - val_loss: 0.2979 Epoch 30/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.9023 - loss: 0.6365 - val_Recall: 0.8919 - val_loss: 0.2764 Epoch 31/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 456us/step - Recall: 0.8847 - loss: 0.6575 - val_Recall: 0.8919 - val_loss: 0.2802 Epoch 32/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 512us/step - Recall: 
0.8844 - loss: 0.6467 - val_Recall: 0.8919 - val_loss: 0.2726 Epoch 33/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 455us/step - Recall: 0.8883 - loss: 0.6207 - val_Recall: 0.8919 - val_loss: 0.2638 Epoch 34/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 456us/step - Recall: 0.9053 - loss: 0.5844 - val_Recall: 0.8919 - val_loss: 0.2585 Epoch 35/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 455us/step - Recall: 0.9145 - loss: 0.5755 - val_Recall: 0.8874 - val_loss: 0.2552 Epoch 36/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 455us/step - Recall: 0.8931 - loss: 0.5853 - val_Recall: 0.8919 - val_loss: 0.2567 Epoch 37/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8817 - loss: 0.5770 - val_Recall: 0.8829 - val_loss: 0.2419 Epoch 38/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8942 - loss: 0.6185 - val_Recall: 0.8874 - val_loss: 0.2426 Epoch 39/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 460us/step - Recall: 0.8979 - loss: 0.5712 - val_Recall: 0.8829 - val_loss: 0.2371 Epoch 40/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 466us/step - Recall: 0.8776 - loss: 0.6175 - val_Recall: 0.8829 - val_loss: 0.2314 Epoch 41/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8976 - loss: 0.5780 - val_Recall: 0.8829 - val_loss: 0.2285 Epoch 42/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8748 - loss: 0.6110 - val_Recall: 0.8829 - val_loss: 0.2231 Epoch 43/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 452us/step - Recall: 0.8651 - loss: 0.5844 - val_Recall: 0.8829 - val_loss: 0.2106 Epoch 44/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8822 - loss: 0.5651 - val_Recall: 0.8829 - val_loss: 0.2148 Epoch 45/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.9045 - loss: 0.5323 - val_Recall: 0.8829 - val_loss: 0.2109 Epoch 46/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8755 - loss: 0.5814 - val_Recall: 0.8829 - val_loss: 0.2056 Epoch 47/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 453us/step - Recall: 0.8907 - loss: 0.5507 - val_Recall: 0.8829 - val_loss: 0.2069 Epoch 48/50 500/500 
━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8788 - loss: 0.5401 - val_Recall: 0.8829 - val_loss: 0.1983 Epoch 49/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 455us/step - Recall: 0.8882 - loss: 0.5583 - val_Recall: 0.8829 - val_loss: 0.1956 Epoch 50/50 500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8789 - loss: 0.5341 - val_Recall: 0.8829 - val_loss: 0.1932
print("Time taken in seconds ",end-start)
Time taken in seconds 12.271375179290771
plot(history,'loss')
plot(history,'Recall')
model_9_train_perf = model_performance_classification(model_9, x_train, y_train) #threshold = 0.5
print("Model 9 Train Performance")
print(model_9_train_perf)
print("-"*50)
print("Model 9 Validation Performance")
model_9_val_perf = model_performance_classification(model_9,x_val,y_val)
print(model_9_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 167us/step Model 9 Train Performance Accuracy Recall Precision F1 Score 0 0.968812 0.938972 0.826937 0.873239 -------------------------------------------------- Model 9 Validation Performance 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 180us/step Accuracy Recall Precision F1 Score 0 0.96425 0.925957 0.809573 0.856777
# Let's get the classification report on y_train and y_val for Model 9
y_train_pred_9 = model_9.predict(x_train)
y_val_pred_9 = model_9.predict(x_val)
print("Classification Report - Train data Model_9",end="\n\n")
cr_train_model_9 = classification_report(y_train,y_train_pred_9>0.5)
print(cr_train_model_9)
print("Classification Report - Validation data Model_9",end="\n\n")
cr_val_model_9 = classification_report(y_val,y_val_pred_9>0.5)
print(cr_val_model_9)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 165us/step 125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 189us/step Classification Report - Train data Model_9 precision recall f1-score support 0 0.99 0.97 0.98 15112 1 0.66 0.91 0.76 888 accuracy 0.97 16000 macro avg 0.83 0.94 0.87 16000 weighted avg 0.98 0.97 0.97 16000 Classification Report - Validation data Model_9 precision recall f1-score support 0 0.99 0.97 0.98 3778 1 0.63 0.88 0.73 222 accuracy 0.96 4000 macro avg 0.81 0.93 0.86 4000 weighted avg 0.97 0.96 0.97 4000
# Predict on the validation data, threshold the probabilities at 0.5, and plot the confusion matrix
y_pred = model_9.predict(x_val)
for i in range(len(y_val)):
if y_pred[i]>0.5:
y_pred[i]=1
else:
y_pred[i]=0
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 189us/step
results.loc[9] = [3,[128,64,64],["relu","relu","relu"],50,32,"adam",[1e-5, "-"],"Xavier - CW ","dropout(0.3) + Batch Norm",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
| 6 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.001, -] | Xavier - CW | dropout(0.3) | 0.234939 | 0.143119 | 0.930180 | 0.900901 | 9.93 |
| 7 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.0001, -] | Xavier - CW | dropout(0.3) | 0.269328 | 0.110118 | 0.920045 | 0.887387 | 10.01 |
| 8 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) | 0.486973 | 0.161076 | 0.888514 | 0.891892 | 10.08 |
| 9 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) + Batch Norm | 0.541462 | 0.193208 | 0.881757 | 0.882883 | 12.27 |
Observations¶
- Using the Batch Normalization regularization technique, both train and validation recall are close to 90%.
- Convergence also occurred in earlier epochs, and near the 50th epoch the graphs show a converging trend.
- The training time has increased, though, because batch normalization is applied after every dense layer.
- Before we conclude, let's try the He normal weight initializer for the above model and see the outcomes.
Model 10¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
model_10 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_10.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1],kernel_initializer='he_normal'))
model_10.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_10.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_10.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_10.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_10.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_10.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_10.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_10.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 128) │ 5,248 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization │ (None, 128) │ 512 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 64) │ 8,256 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_1 │ (None, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 64) │ 4,160 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_2 │ (None, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 65 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 18,753 (73.25 KB)
Trainable params: 18,241 (71.25 KB)
Non-trainable params: 512 (2.00 KB)
# Adam optimizer
lr = 1e-5
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_10.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
start = time.time()
history = model_10.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 625us/step - Recall: 0.5208 - loss: 1.6560 - val_Recall: 0.7072 - val_loss: 0.6930
...
Epoch 50/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 454us/step - Recall: 0.8897 - loss: 0.5513 - val_Recall: 0.8829 - val_loss: 0.2091
[epochs 2-49 omitted: val_loss fell steadily from 0.69 to 0.21 while val_Recall plateaued around 0.88 from epoch ~12 onward]
print("Time taken in seconds ",end-start)
Time taken in seconds 12.212008953094482
plot(history,'loss')
plot(history,'Recall')
model_10_train_perf = model_performance_classification(model_10, x_train, y_train) #threshold = 0.5
print("Model 10 Train Performance")
print(model_10_train_perf)
print("-"*50)
print("Model 10 Validation Performance")
model_10_val_perf = model_performance_classification(model_10,x_val,y_val)
print(model_10_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 168us/step
Model 10 Train Performance
   Accuracy    Recall  Precision  F1 Score
0  0.956187  0.930699   0.780056  0.836005
--------------------------------------------------
Model 10 Validation Performance
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 182us/step
   Accuracy    Recall  Precision  F1 Score
0   0.95275  0.919869    0.76941  0.824612
#let's get classification report on y train and y val for Model 10
y_train_pred_10 = model_10.predict(x_train)
y_val_pred_10 = model_10.predict(x_val)
print("Classification Report - Train data Model_10",end="\n\n")
cr_train_model_10 = classification_report(y_train,y_train_pred_10>0.5)
print(cr_train_model_10)
print("Classification Report - Validation data Model_10",end="\n\n")
cr_val_model_10 = classification_report(y_val,y_val_pred_10>0.5)
print(cr_val_model_10)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 164us/step
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 185us/step
Classification Report - Train data Model_10

              precision    recall  f1-score   support

           0       0.99      0.96      0.98     15112
           1       0.57      0.90      0.70       888

    accuracy                           0.96     16000
   macro avg       0.78      0.93      0.84     16000
weighted avg       0.97      0.96      0.96     16000

Classification Report - Validation data Model_10

              precision    recall  f1-score   support

           0       0.99      0.96      0.97      3778
           1       0.55      0.88      0.67       222

    accuracy                           0.95      4000
   macro avg       0.77      0.92      0.82      4000
weighted avg       0.97      0.95      0.96      4000
# Predict probabilities on the validation set and apply the 0.5 threshold
y_pred = model_10.predict(x_val)
for i in range(len(y_val)):
    if y_pred[i] > 0.5:
        y_pred[i] = 1
    else:
        y_pred[i] = 0
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 184us/step
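The per-element thresholding loop above can also be written as a single vectorized comparison, which is the more idiomatic NumPy style. A minimal sketch with illustrative probabilities (not the notebook's actual predictions):

```python
import numpy as np

# probabilities in the shape model.predict() returns -- (n, 1)
y_prob = np.array([[0.12], [0.67], [0.49], [0.88]])

# threshold at 0.5 in one step; astype(int) maps True/False to 1/0,
# ravel() flattens (n, 1) to (n,) for confusion_matrix
y_pred = (y_prob > 0.5).astype(int).ravel()

print(y_pred)  # [0 1 0 1]
```

The result can be passed straight to `confusion_matrix(y_val, y_pred)` with no Python-level loop.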
results.loc[10] = [3,[128,64,64],["relu","relu","relu"],50,32,"adam",[1e-5, "-"],"He Normal - CW","dropout(0.3) + Batch Norm",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| | # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
| 6 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.001, -] | Xavier - CW | dropout(0.3) | 0.234939 | 0.143119 | 0.930180 | 0.900901 | 9.93 |
| 7 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.0001, -] | Xavier - CW | dropout(0.3) | 0.269328 | 0.110118 | 0.920045 | 0.887387 | 10.01 |
| 8 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) | 0.486973 | 0.161076 | 0.888514 | 0.891892 | 10.08 |
| 9 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) + Batch Norm | 0.541462 | 0.193208 | 0.881757 | 0.882883 | 12.27 |
| 10 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | He Normal - CW | dropout(0.3) + Batch Norm | 0.561231 | 0.209055 | 0.881757 | 0.882883 | 12.21 |
Observations¶
- After switching to the he_normal weight initializer, recall does not change significantly compared to Model 9
- Let's try Model 9 with callbacks:
  - Learning rate scheduler: reduces the learning rate when performance plateaus
  - Early stopping: avoids unnecessary overfitting and oscillation in later epochs
Model 11 - reduce learning rate and early stopping¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
# changing the weight initializer back to Xavier (glorot_normal)
model_11 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_11.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1],kernel_initializer='glorot_normal'))
model_11.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_11.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_11.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_11.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_11.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_11.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_11.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_11.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 128)            │         5,248 │
│ batch_normalization             │ (None, 128)            │           512 │
│ (BatchNormalization)            │                        │               │
│ dropout (Dropout)               │ (None, 128)            │             0 │
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
│ batch_normalization_1           │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
│ dense_2 (Dense)                 │ (None, 64)             │         4,160 │
│ batch_normalization_2           │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
│ dense_3 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 18,753 (73.25 KB)
Trainable params: 18,241 (71.25 KB)
Non-trainable params: 512 (2.00 KB)
# Adam optimizer
lr = 1e-5
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_11.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
# callbacks to reduce oscillation and overfitting
from keras.callbacks import ReduceLROnPlateau, EarlyStopping
lr_schedule = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, verbose=1, min_lr=1e-6) # learning rate schedule
early_stop = EarlyStopping(monitor='val_Recall',mode='max', patience=10, restore_best_weights=True) #early stopping
start = time.time()
history = model_11.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=batch_size,
epochs=epochs,class_weight=cw_dict,callbacks=[lr_schedule,early_stop])
end=time.time()
Epoch 1/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 619us/step - Recall: 0.2812 - loss: 2.2647 - val_Recall: 0.3018 - val_loss: 0.7829 - learning_rate: 1.0000e-05
...
Epoch 24/50
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 459us/step - Recall: 0.8983 - loss: 0.7156 - val_Recall: 0.8829 - val_loss: 0.3378 - learning_rate: 1.0000e-05
[epochs 2-23 omitted: val_loss fell from 0.78 to 0.34; early stopping (patience=10 on val_Recall, best 0.8964 at epoch 14) ended training after epoch 24 and restored the best weights]
print("Time taken in seconds ",end-start)
Time taken in seconds 6.189470052719116
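The fit call above passes `class_weight=cw_dict`, which was computed earlier in the notebook. For reference, the usual "balanced" heuristic weights each class by `n_samples / (n_classes * n_class_samples)`; a minimal pure-Python sketch using the class counts from the train split's classification report (15112 negatives, 888 positives):

```python
# illustrative counts matching the train split shown in the reports
counts = {0: 15112, 1: 888}
n_samples = sum(counts.values())   # 16000
n_classes = len(counts)            # 2

# "balanced" heuristic, as implemented by sklearn's compute_class_weight
cw_dict_sketch = {c: n_samples / (n_classes * n) for c, n in counts.items()}

print(cw_dict_sketch)  # class 1 is weighted roughly 9x class 0
```

The minority (failure) class gets a weight of about 9, which is why the loss values in these runs are much larger than in the unweighted models.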
plot(history,'loss')
plot(history,'Recall')
model_11_train_perf = model_performance_classification(model_11, x_train, y_train) #threshold = 0.5
print("Model 11 Train Performance")
print(model_11_train_perf)
print("-"*50)
print("Model 11 Validation Performance")
model_11_val_perf = model_performance_classification(model_11,x_val,y_val)
print(model_11_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 169us/step
Model 11 Train Performance
   Accuracy    Recall  Precision  F1 Score
0  0.844375  0.875747    0.62253  0.652296
--------------------------------------------------
Model 11 Validation Performance
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 180us/step
   Accuracy    Recall  Precision  F1 Score
0     0.837  0.864953   0.616544  0.642617
#let's get classification report on y train and y val for Model 11
y_train_pred_11 = model_11.predict(x_train)
y_val_pred_11 = model_11.predict(x_val)
print("Classification Report - Train data Model_11",end="\n\n")
cr_train_model_11 = classification_report(y_train,y_train_pred_11>0.5)
print(cr_train_model_11)
print("Classification Report - Validation data Model_11",end="\n\n")
cr_val_model_11 = classification_report(y_val,y_val_pred_11>0.5)
print(cr_val_model_11)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 167us/step
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 187us/step
Classification Report - Train data Model_11

              precision    recall  f1-score   support

           0       0.99      0.84      0.91     15112
           1       0.25      0.91      0.39       888

    accuracy                           0.84     16000
   macro avg       0.62      0.88      0.65     16000
weighted avg       0.95      0.84      0.88     16000

Classification Report - Validation data Model_11

              precision    recall  f1-score   support

           0       0.99      0.83      0.91      3778
           1       0.24      0.90      0.38       222

    accuracy                           0.84      4000
   macro avg       0.62      0.86      0.64      4000
weighted avg       0.95      0.84      0.88      4000
# Predict probabilities on the validation set and apply the 0.5 threshold
y_pred = model_11.predict(x_val)
for i in range(len(y_val)):
    if y_pred[i] > 0.5:
        y_pred[i] = 1
    else:
        y_pred[i] = 0
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 190us/step
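For binary labels [0, 1], sklearn's `confusion_matrix` lays the quadrants out as [[TN, FP], [FN, TP]], which is exactly the ordering assumed by the `labels` list passed to `make_confusion_matrix`. A pure-Python sketch of how recall and precision fall out of those four counts; the numbers here are illustrative, chosen to roughly match Model 11's validation report (222 positives, recall ~0.90, precision ~0.24):

```python
# sklearn confusion-matrix layout for labels [0, 1]:
# [[TN, FP],
#  [FN, TP]]
cm = [[3150, 628],   # illustrative counts, not the notebook's exact matrix
      [  22, 200]]
tn, fp = cm[0]
fn, tp = cm[1]

recall = tp / (tp + fn)      # fraction of true failures that were caught
precision = tp / (tp + fp)   # fraction of predicted failures that are real

print(round(recall, 2), round(precision, 2))  # 0.9 0.24
```

With these counts the model catches 90% of failures but floods the maintenance team with false alarms, which is the trade-off the low F1 score for class 1 is signalling.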
results.loc[11] = [3,[128,64,64],["relu","relu","relu"],50,32,"adam",[1e-5, "-"],"xavier - CW","dropout(0.3) + Batch Norm + Callbacks[lr_schedule, early_stop]",history.history["loss"][-1],history.history["val_loss"][-1],history.history["Recall"][-1],history.history["val_Recall"][-1],round(end-start,2)]
results
| | # hidden layers | # neurons - hidden layer | activation function - hidden layer | # epochs | batch size | optimizer | learning rate, momentum | weight initializer | regularization | train loss | validation loss | train recall | validation recall | time (secs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [128] | [relu] | 50 | 32 | [sgd] | [0.01, -] | - | - | 0.040534 | 0.059940 | 0.861486 | 0.851351 | 6.92 |
| 1 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | - | 0.028434 | 0.055182 | 0.902027 | 0.869369 | 7.70 |
| 2 | 2 | [128, 64] | [relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.5) | 0.050771 | 0.056083 | 0.834459 | 0.855856 | 8.08 |
| 3 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | - | dropout(0.3) | 0.038127 | 0.053664 | 0.880631 | 0.869369 | 8.56 |
| 4 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, -] | Xavier - CW | dropout(0.3) | 0.245499 | 0.108515 | 0.925676 | 0.891892 | 8.46 |
| 5 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | sgd | [0.01, 0.3] | Xavier - CW | dropout(0.3) | 0.257260 | 0.112013 | 0.921171 | 0.887387 | 8.89 |
| 6 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.001, -] | Xavier - CW | dropout(0.3) | 0.234939 | 0.143119 | 0.930180 | 0.900901 | 9.93 |
| 7 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [0.0001, -] | Xavier - CW | dropout(0.3) | 0.269328 | 0.110118 | 0.920045 | 0.887387 | 10.01 |
| 8 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) | 0.486973 | 0.161076 | 0.888514 | 0.891892 | 10.08 |
| 9 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | Xavier - CW | dropout(0.3) + Batch Norm | 0.541462 | 0.193208 | 0.881757 | 0.882883 | 12.27 |
| 10 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | He Normal - CW | dropout(0.3) + Batch Norm | 0.561231 | 0.209055 | 0.881757 | 0.882883 | 12.21 |
| 11 | 3 | [128, 64, 64] | [relu, relu, relu] | 50 | 32 | adam | [1e-05, -] | xavier - CW | dropout(0.3) + Batch Norm + Callbacks[lr_sched... | 0.709747 | 0.337831 | 0.899775 | 0.882883 | 6.19 |
Observations¶
- Early stopping (patience=10 on val_Recall) halted training at epoch 24, well before the loss had converged, so the model is undertrained: validation precision for the failure class collapsed to ~0.24 even though recall stayed around 0.90
- Let's keep only the learning rate scheduler and drop early stopping (Model 12), using a larger batch size of 64
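For intuition, the core of what `ReduceLROnPlateau(factor=0.5, patience=5, min_lr=1e-6)` does can be emulated in a few lines of pure Python. This is a simplified sketch of the plateau logic, not Keras's exact implementation (Keras also supports a `min_delta` tolerance and cooldown periods):

```python
# Simplified emulation of ReduceLROnPlateau: halve the learning rate
# once val_loss has failed to improve for more than `patience` epochs
def schedule_lr(val_losses, lr=1e-5, factor=0.5, patience=5, min_lr=1e-6):
    best, wait = float("inf"), 0
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0       # improvement: reset the counter
        else:
            wait += 1
            if wait > patience:        # plateau detected
                lr = max(lr * factor, min_lr)
                wait = 0
    return lr

# loss improves for 3 epochs, then plateaus for 8 -> exactly one halving
losses = [0.9, 0.8, 0.7] + [0.7] * 8
print(schedule_lr(losses))  # 5e-06
```

In the Model 12 run below, val_loss keeps improving every few epochs, so the callback never fires and the learning rate stays at 1e-05 throughout.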
Model 12 - reduce learning rate¶
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
# changing the weight initializer back to Xavier (glorot_normal)
model_12 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_12.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1],kernel_initializer='glorot_normal'))
model_12.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_12.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_12.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_12.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_12.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_12.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_12.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_12.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 128)            │         5,248 │
│ batch_normalization             │ (None, 128)            │           512 │
│ (BatchNormalization)            │                        │               │
│ dropout (Dropout)               │ (None, 128)            │             0 │
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
│ batch_normalization_1           │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
│ dense_2 (Dense)                 │ (None, 64)             │         4,160 │
│ batch_normalization_2           │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
│ dense_3 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 18,753 (73.25 KB)
Trainable params: 18,241 (71.25 KB)
Non-trainable params: 512 (2.00 KB)
# Adam optimizer
lr = 1e-5
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_12.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
# callbacks to reduce oscillation and overfitting
lr_schedule = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, verbose=1, min_lr=1e-6) # learning rate schedule
start = time.time()
history = model_12.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=64,
epochs=epochs,class_weight=cw_dict,callbacks=[lr_schedule])
end=time.time()
Epoch 1/50
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 879us/step - Recall: 0.5377 - loss: 1.9114 - val_Recall: 0.6937 - val_loss: 0.8911 - learning_rate: 1.0000e-05
...
Epoch 50/50
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 549us/step - Recall: 0.8937 - loss: 0.5899 - val_Recall: 0.8964 - val_loss: 0.2820 - learning_rate: 1.0000e-05
[epochs 2-49 omitted: val_loss fell steadily from 0.89 to 0.28, so ReduceLROnPlateau never fired and the learning rate stayed at 1e-05 throughout; val_Recall peaked at 0.9324 around epoch 17]
print("Time taken in seconds ",end-start)
Time taken in seconds 7.67601203918457
plot(history,'loss')
plot(history,'Recall')
model_12_train_perf = model_performance_classification(model_12, x_train, y_train) #threshold = 0.5
print("Model 12 Train Performance")
print(model_12_train_perf)
print("-"*50)
print("Model 12 Validation Performance")
model_12_val_perf = model_performance_classification(model_12,x_val,y_val)
print(model_12_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 170us/step
Model 12 Train Performance
   Accuracy    Recall  Precision  F1 Score
0  0.946875  0.929479   0.753247  0.813266
--------------------------------------------------
Model 12 Validation Performance
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 183us/step
   Accuracy    Recall  Precision  F1 Score
0     0.943  0.921067   0.743089  0.802432
#let's get classification report on y train and y val for Model 12
y_train_pred_12 = model_12.predict(x_train)
y_val_pred_12 = model_12.predict(x_val)
print("Classification Report - Train data Model_12",end="\n\n")
cr_train_model_12 = classification_report(y_train,y_train_pred_12>0.5)
print(cr_train_model_12)
print("Classification Report - Validation data Model_12",end="\n\n")
cr_val_model_12 = classification_report(y_val,y_val_pred_12>0.5)
print(cr_val_model_12)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 168us/step
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 191us/step
Classification Report - Train data Model_12

              precision    recall  f1-score   support

           0       0.99      0.95      0.97     15112
           1       0.51      0.91      0.66       888

    accuracy                           0.95     16000
   macro avg       0.75      0.93      0.81     16000
weighted avg       0.97      0.95      0.95     16000

Classification Report - Validation data Model_12

              precision    recall  f1-score   support

           0       0.99      0.95      0.97      3778
           1       0.49      0.90      0.64       222

    accuracy                           0.94      4000
   macro avg       0.74      0.92      0.80      4000
weighted avg       0.97      0.94      0.95      4000
# Predict probabilities, apply the 0.5 threshold, then plot the confusion matrix
y_pred = model_12.predict(x_val)
for i in range(len(y_val)):
    if y_pred[i] > 0.5:
        y_pred[i] = 1
    else:
        y_pred[i] = 0
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 193us/step
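The element-wise thresholding loop above can also be written as a single vectorized NumPy expression, which is faster and harder to get wrong. A minimal sketch with made-up probabilities (a real Keras `predict()` output has shape `(n_samples, 1)`):

```python
import numpy as np

# Hypothetical predicted probabilities standing in for model_12.predict(x_val)
y_prob = np.array([[0.12], [0.81], [0.47], [0.66]])

# Vectorized thresholding at 0.5: replaces the per-element Python loop
y_pred = (y_prob > 0.5).astype(int).ravel()
print(y_pred)  # -> [0 1 0 1]
```

The same one-liner works for the 0.3 threshold used later; only the cutoff changes.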
Observation¶
- Removing early stopping from Model 12 and keeping only the ReduceLROnPlateau callback increased recall, but precision is still only around 50%.
- Hence, reducing the learning rate on plateau alone does not add much value here.
Model 13¶
Using a 0.3 prediction threshold to see whether recall increases.
#clear current keras session
tf.keras.backend.clear_session()
# initializing the neural network
# changing the weight initializer back to Xavier (glorot_normal)
model_13 = Sequential()
# Adding a dense layer with ReLu activation function, with the input dimension determined by the shape of the training data.
model_13.add(Dense(128, activation = 'relu', input_dim = x_train.shape[1],kernel_initializer='glorot_normal'))
model_13.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_13.add(Dropout(0.3)) # Adding 30% Dropout after first hidden layer
model_13.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_13.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_13.add(Dense(64, activation = 'relu')) #Adding a dense layer with ReLu activation function
model_13.add(BatchNormalization()) # Adding Batch Normalization after Dense Layer
model_13.add(Dense(1,activation = 'sigmoid')) #adding output layer with sigmoid as the output is binary 1 or 0
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/core/dense.py:93: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
model_13.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 128)            │         5,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 128)            │           512 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1           │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 64)             │         4,160 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2           │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 18,753 (73.25 KB)
Trainable params: 18,241 (71.25 KB)
Non-trainable params: 512 (2.00 KB)
# Adam optimizer
lr = 1e-5
optimizer = keras.optimizers.Adam(learning_rate=lr) # defining Adam as the optimizer to be used
model_13.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["Recall"])
# callbacks to reduce oscillation and overfitting
lr_schedule = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, verbose=1, min_lr=1e-6) # learning-rate schedule
start = time.time()
history = model_13.fit(x_train, y_train, validation_data=(x_val,y_val) , batch_size=64,
epochs=epochs,class_weight=cw_dict,callbacks=[lr_schedule])
end=time.time()
Epoch 1/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 852us/step - Recall: 0.4774 - loss: 1.7069 - val_Recall: 0.6261 - val_loss: 0.7153 - learning_rate: 1.0000e-05 Epoch 2/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 526us/step - Recall: 0.6269 - loss: 1.4742 - val_Recall: 0.7477 - val_loss: 0.6806 - learning_rate: 1.0000e-05 Epoch 3/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 520us/step - Recall: 0.7155 - loss: 1.3133 - val_Recall: 0.8423 - val_loss: 0.6618 - learning_rate: 1.0000e-05 Epoch 4/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 517us/step - Recall: 0.7981 - loss: 1.2215 - val_Recall: 0.8829 - val_loss: 0.6404 - learning_rate: 1.0000e-05 Epoch 5/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 517us/step - Recall: 0.8406 - loss: 1.1414 - val_Recall: 0.8919 - val_loss: 0.6215 - learning_rate: 1.0000e-05 Epoch 6/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 517us/step - Recall: 0.8630 - loss: 1.0986 - val_Recall: 0.8874 - val_loss: 0.6118 - learning_rate: 1.0000e-05 Epoch 7/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 518us/step - Recall: 0.8772 - loss: 1.0879 - val_Recall: 0.8964 - val_loss: 0.6070 - learning_rate: 1.0000e-05 Epoch 8/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 526us/step - Recall: 0.8908 - loss: 1.0408 - val_Recall: 0.9009 - val_loss: 0.5853 - learning_rate: 1.0000e-05 Epoch 9/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 557us/step - Recall: 0.8891 - loss: 0.9865 - val_Recall: 0.9054 - val_loss: 0.5687 - learning_rate: 1.0000e-05 Epoch 10/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 550us/step - Recall: 0.9045 - loss: 0.9607 - val_Recall: 0.9099 - val_loss: 0.5549 - learning_rate: 1.0000e-05 Epoch 11/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 536us/step - Recall: 0.9035 - loss: 0.9573 - val_Recall: 0.9099 - val_loss: 0.5421 - learning_rate: 1.0000e-05 Epoch 12/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 536us/step - Recall: 0.9292 - loss: 0.9057 - val_Recall: 0.9189 - val_loss: 0.5264 - learning_rate: 1.0000e-05 Epoch 13/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 519us/step - Recall: 0.9237 - loss: 0.9133 - val_Recall: 0.9234 - val_loss: 0.5176 - learning_rate: 
1.0000e-05 Epoch 14/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 695us/step - Recall: 0.9306 - loss: 0.8960 - val_Recall: 0.9234 - val_loss: 0.5109 - learning_rate: 1.0000e-05 Epoch 15/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 568us/step - Recall: 0.9188 - loss: 0.8707 - val_Recall: 0.9234 - val_loss: 0.4931 - learning_rate: 1.0000e-05 Epoch 16/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 529us/step - Recall: 0.9215 - loss: 0.8733 - val_Recall: 0.9234 - val_loss: 0.4864 - learning_rate: 1.0000e-05 Epoch 17/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 535us/step - Recall: 0.9230 - loss: 0.8532 - val_Recall: 0.9234 - val_loss: 0.4704 - learning_rate: 1.0000e-05 Epoch 18/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 550us/step - Recall: 0.9171 - loss: 0.8296 - val_Recall: 0.9234 - val_loss: 0.4618 - learning_rate: 1.0000e-05 Epoch 19/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 551us/step - Recall: 0.9249 - loss: 0.8135 - val_Recall: 0.9234 - val_loss: 0.4496 - learning_rate: 1.0000e-05 Epoch 20/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 534us/step - Recall: 0.9237 - loss: 0.8023 - val_Recall: 0.9234 - val_loss: 0.4403 - learning_rate: 1.0000e-05 Epoch 21/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 541us/step - Recall: 0.9267 - loss: 0.7885 - val_Recall: 0.9234 - val_loss: 0.4288 - learning_rate: 1.0000e-05 Epoch 22/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 555us/step - Recall: 0.9045 - loss: 0.8147 - val_Recall: 0.9234 - val_loss: 0.4209 - learning_rate: 1.0000e-05 Epoch 23/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 546us/step - Recall: 0.9162 - loss: 0.7921 - val_Recall: 0.9234 - val_loss: 0.4139 - learning_rate: 1.0000e-05 Epoch 24/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 544us/step - Recall: 0.9151 - loss: 0.7717 - val_Recall: 0.9234 - val_loss: 0.4002 - learning_rate: 1.0000e-05 Epoch 25/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 527us/step - Recall: 0.9080 - loss: 0.7666 - val_Recall: 0.9234 - val_loss: 0.3976 - learning_rate: 1.0000e-05 Epoch 26/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 533us/step - Recall: 0.9177 - loss: 0.7684 - val_Recall: 0.9234 - val_loss: 0.3879 - 
learning_rate: 1.0000e-05 Epoch 27/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 535us/step - Recall: 0.9217 - loss: 0.7429 - val_Recall: 0.9234 - val_loss: 0.3792 - learning_rate: 1.0000e-05 Epoch 28/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 551us/step - Recall: 0.9063 - loss: 0.7674 - val_Recall: 0.9234 - val_loss: 0.3723 - learning_rate: 1.0000e-05 Epoch 29/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 555us/step - Recall: 0.9156 - loss: 0.7189 - val_Recall: 0.9234 - val_loss: 0.3614 - learning_rate: 1.0000e-05 Epoch 30/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 535us/step - Recall: 0.9078 - loss: 0.7119 - val_Recall: 0.9234 - val_loss: 0.3524 - learning_rate: 1.0000e-05 Epoch 31/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 535us/step - Recall: 0.9098 - loss: 0.7160 - val_Recall: 0.9189 - val_loss: 0.3467 - learning_rate: 1.0000e-05 Epoch 32/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 525us/step - Recall: 0.9059 - loss: 0.7227 - val_Recall: 0.9189 - val_loss: 0.3402 - learning_rate: 1.0000e-05 Epoch 33/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 529us/step - Recall: 0.9140 - loss: 0.6872 - val_Recall: 0.9189 - val_loss: 0.3370 - learning_rate: 1.0000e-05 Epoch 34/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 541us/step - Recall: 0.9094 - loss: 0.6732 - val_Recall: 0.9144 - val_loss: 0.3215 - learning_rate: 1.0000e-05 Epoch 35/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 534us/step - Recall: 0.9057 - loss: 0.6787 - val_Recall: 0.9144 - val_loss: 0.3210 - learning_rate: 1.0000e-05 Epoch 36/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 525us/step - Recall: 0.9198 - loss: 0.6622 - val_Recall: 0.9144 - val_loss: 0.3117 - learning_rate: 1.0000e-05 Epoch 37/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 552us/step - Recall: 0.9118 - loss: 0.6672 - val_Recall: 0.9144 - val_loss: 0.3067 - learning_rate: 1.0000e-05 Epoch 38/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 547us/step - Recall: 0.9036 - loss: 0.6719 - val_Recall: 0.9144 - val_loss: 0.3031 - learning_rate: 1.0000e-05 Epoch 39/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 542us/step - Recall: 0.9054 - loss: 0.6753 - val_Recall: 0.9144 - 
val_loss: 0.2977 - learning_rate: 1.0000e-05 Epoch 40/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 668us/step - Recall: 0.9113 - loss: 0.6444 - val_Recall: 0.9009 - val_loss: 0.2906 - learning_rate: 1.0000e-05 Epoch 41/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 539us/step - Recall: 0.9227 - loss: 0.6169 - val_Recall: 0.9054 - val_loss: 0.2877 - learning_rate: 1.0000e-05 Epoch 42/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 542us/step - Recall: 0.9226 - loss: 0.6123 - val_Recall: 0.9054 - val_loss: 0.2787 - learning_rate: 1.0000e-05 Epoch 43/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 537us/step - Recall: 0.9021 - loss: 0.6388 - val_Recall: 0.9054 - val_loss: 0.2776 - learning_rate: 1.0000e-05 Epoch 44/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 549us/step - Recall: 0.9169 - loss: 0.6237 - val_Recall: 0.9009 - val_loss: 0.2702 - learning_rate: 1.0000e-05 Epoch 45/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 535us/step - Recall: 0.9182 - loss: 0.5998 - val_Recall: 0.9009 - val_loss: 0.2633 - learning_rate: 1.0000e-05 Epoch 46/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 520us/step - Recall: 0.9138 - loss: 0.5913 - val_Recall: 0.9009 - val_loss: 0.2619 - learning_rate: 1.0000e-05 Epoch 47/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 518us/step - Recall: 0.9015 - loss: 0.6107 - val_Recall: 0.9054 - val_loss: 0.2575 - learning_rate: 1.0000e-05 Epoch 48/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 521us/step - Recall: 0.9036 - loss: 0.6250 - val_Recall: 0.9054 - val_loss: 0.2543 - learning_rate: 1.0000e-05 Epoch 49/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 560us/step - Recall: 0.8869 - loss: 0.6151 - val_Recall: 0.8964 - val_loss: 0.2458 - learning_rate: 1.0000e-05 Epoch 50/50 250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 544us/step - Recall: 0.9052 - loss: 0.6044 - val_Recall: 0.9009 - val_loss: 0.2409 - learning_rate: 1.0000e-05
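The `cw_dict` passed to `fit` above is defined earlier in the notebook; one common way to derive such balanced class weights is shown below as a sketch on toy imbalanced labels (not the actual training data):

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced labels standing in for y_train
y = np.array([0] * 90 + [1] * 10)

# "balanced" weights: n_samples / (n_classes * class_count)
weights = compute_class_weight(class_weight="balanced", classes=np.unique(y), y=y)
cw_dict = dict(zip(np.unique(y), weights))
print(cw_dict)  # minority class receives the larger weight
```

Passing this dictionary via `class_weight=` makes the loss penalize missed failures (class 1) more heavily, which is why recall improves relative to an unweighted fit.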
print("Time taken in seconds ",end-start)
Time taken in seconds 7.6471638679504395
plot(history,'loss')
plot(history,'Recall')
model_13_train_perf = model_performance_classification(model_13, x_train, y_train) #threshold = 0.5
print("Model 13 Train Performance")
print(model_13_train_perf)
print("-"*50)
print("Model 13 Validation Performance")
model_13_val_perf = model_performance_classification(model_13,x_val,y_val)
print(model_13_val_perf)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 183us/step
Model 13 Train Performance
   Accuracy    Recall  Precision  F1 Score
0  0.953063  0.932224     0.7704  0.828555
--------------------------------------------------
Model 13 Validation Performance
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 185us/step
   Accuracy    Recall  Precision  F1 Score
0    0.9515  0.927687   0.765785  0.823603
#let's get classification report on y train and y val for Model 13
y_train_pred_13 = model_13.predict(x_train)
y_val_pred_13 = model_13.predict(x_val)
print("Classification Report - Train data Model_13",end="\n\n")
cr_train_model_13 = classification_report(y_train,y_train_pred_13>0.3)
print(cr_train_model_13)
print("Classification Report - Validation data Model_13",end="\n\n")
cr_val_model_13 = classification_report(y_val,y_val_pred_13>0.3)
print(cr_val_model_13)
500/500 ━━━━━━━━━━━━━━━━━━━━ 0s 188us/step
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 198us/step
Classification Report - Train data Model_13

              precision    recall  f1-score   support

           0       1.00      0.86      0.92     15112
           1       0.28      0.93      0.43       888

    accuracy                           0.86     16000
   macro avg       0.64      0.90      0.68     16000
weighted avg       0.96      0.86      0.90     16000

Classification Report - Validation data Model_13

              precision    recall  f1-score   support

           0       1.00      0.85      0.92      3778
           1       0.26      0.93      0.41       222

    accuracy                           0.85      4000
   macro avg       0.63      0.89      0.66      4000
weighted avg       0.95      0.85      0.89      4000
# Predict probabilities, apply the 0.3 threshold, then plot the confusion matrix
y_pred = model_13.predict(x_val)
for i in range(len(y_val)):
    if y_pred[i] > 0.3:
        y_pred[i] = 1
    else:
        y_pred[i] = 0
cm = confusion_matrix(y_val,y_pred)
labels = ['True Negative','False Positive','False Negative','True Positive']
make_confusion_matrix(cm,group_names=labels,cmap = "Blues")
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 279us/step
Observation¶
- Threshold tuning (lowering the prediction threshold from 0.5 to 0.3) increased recall to 93%, but precision dropped drastically to about 27% per the classification report. Hence, this is not a suitable model.
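Rather than trying thresholds one at a time, the full precision/recall trade-off can be swept in a single pass with `sklearn.metrics.precision_recall_curve`. A sketch on made-up labels and probabilities (standing in for `y_val` and `y_val_pred_13`):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical validation labels and predicted probabilities
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.1, 0.2, 0.35, 0.4, 0.45, 0.6, 0.55, 0.7, 0.15, 0.8])

# One precision/recall pair per candidate threshold
precision, recall, thresholds = precision_recall_curve(y_true, y_prob)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Plotting precision and recall against the threshold would make the 0.5-vs-0.3 trade-off observed above visible at a glance, instead of requiring a retrain or re-evaluation per threshold.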
Model Performance Comparison and Final Model Selection¶
Now, in order to select the final model, we will compare the performances of all the models for the training and validation sets.
# training performance comparison
models_train_comp_df = pd.concat(
[
model_0_train_perf.T,
model_1_train_perf.T,
model_2_train_perf.T,
model_3_train_perf.T,
model_4_train_perf.T,
model_5_train_perf.T,
model_6_train_perf.T,
model_7_train_perf.T,
model_8_train_perf.T,
model_9_train_perf.T,
model_10_train_perf.T,
model_11_train_perf.T,
model_12_train_perf.T,
model_13_train_perf.T
],
axis=1,
)
models_train_comp_df.columns = [
"Model 0",
"Model 1",
"Model 2",
"Model 3",
"Model 4",
"Model 5",
"Model 6",
"Model 7",
"Model 8",
"Model 9",
"Model 10",
"Model 11",
"Model 12",
"Model 13",
]
print("Training set performance comparison:")
models_train_comp_df
Training set performance comparison:
| | Model 0 | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 | Model 9 | Model 10 | Model 11 | Model 12 | Model 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.992125 | 0.994437 | 0.992938 | 0.993750 | 0.989812 | 0.991500 | 0.987125 | 0.989313 | 0.971625 | 0.968812 | 0.956187 | 0.844375 | 0.946875 | 0.953063 |
| Recall | 0.934884 | 0.953597 | 0.941144 | 0.948463 | 0.961218 | 0.962642 | 0.966685 | 0.958304 | 0.938341 | 0.938972 | 0.930699 | 0.875747 | 0.929479 | 0.932224 |
| Precision | 0.989206 | 0.992995 | 0.990906 | 0.991422 | 0.943856 | 0.956836 | 0.919863 | 0.941949 | 0.840087 | 0.826937 | 0.780056 | 0.622530 | 0.753247 | 0.770400 |
| F1 Score | 0.960243 | 0.972370 | 0.964526 | 0.968834 | 0.952347 | 0.959718 | 0.941851 | 0.949956 | 0.881904 | 0.873239 | 0.836005 | 0.652296 | 0.813266 | 0.828555 |
Observation¶
Out of all 14 models (Model 0 through Model 13), the strongest performers on the training set are:
- Model 6
- Model 5
- Model 4
# Validation performance comparison
models_val_comp_df = pd.concat(
[
model_0_val_perf.T,
model_1_val_perf.T,
model_2_val_perf.T,
model_3_val_perf.T,
model_4_val_perf.T,
model_5_val_perf.T,
model_6_val_perf.T,
model_7_val_perf.T,
model_8_val_perf.T,
model_9_val_perf.T,
model_10_val_perf.T,
model_11_val_perf.T,
model_12_val_perf.T,
model_13_val_perf.T
],
axis=1,
)
models_val_comp_df.columns = [
"Model 0",
"Model 1",
"Model 2",
"Model 3",
"Model 4",
"Model 5",
"Model 6",
"Model 7",
"Model 8",
"Model 9",
"Model 10",
"Model 11",
"Model 12",
"Model 13"
]
print("Validation set performance comparison:")
models_val_comp_df
Validation set performance comparison:
| | Model 0 | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 | Model 9 | Model 10 | Model 11 | Model 12 | Model 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.990000 | 0.991750 | 0.990250 | 0.991500 | 0.987500 | 0.988250 | 0.982500 | 0.987250 | 0.970000 | 0.964250 | 0.952750 | 0.837000 | 0.943000 | 0.951500 |
| Recall | 0.924749 | 0.934155 | 0.927002 | 0.934023 | 0.942505 | 0.940782 | 0.944098 | 0.940253 | 0.933241 | 0.925957 | 0.919869 | 0.864953 | 0.921067 | 0.927687 |
| Precision | 0.977805 | 0.986035 | 0.978026 | 0.983560 | 0.938786 | 0.946466 | 0.900294 | 0.938395 | 0.833497 | 0.809573 | 0.769410 | 0.616544 | 0.743089 | 0.765785 |
| F1 Score | 0.949515 | 0.958444 | 0.950888 | 0.957281 | 0.940637 | 0.943603 | 0.920884 | 0.939321 | 0.875704 | 0.856777 | 0.824612 | 0.642617 | 0.802432 | 0.823603 |
Choosing Best Model reasoning¶
After comparing training and validation metrics, with recall as the top priority since false negatives (missed failures) are costly, Model 6 emerges as the best model.
Better Generalization (Smaller Overfitting Gap)
Although Model 6 has slightly lower validation accuracy and precision than Models 4 and 5, it generalizes more consistently.
The recall difference between training and validation is very small (96.7% → 94.4%), indicating minimal overfitting, while Models 4 and 5 show slightly wider gaps and may overfit more.
Optimized for High-Risk Class (Failure Detection)
Precision is slightly lower in Model 6, but the trade-off is deliberate—Model 6 is tuned to maximize recall (fewer false negatives), which is more aligned with the business objective.
Models 4 and 5 have better precision, but at the cost of slightly lower recall—risking more missed failures.
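The generalization-gap argument can be made explicit by differencing the train and validation recall values per model. A sketch on hypothetical model names and numbers (not the actual comparison frames above):

```python
import pandas as pd

# Toy train/validation recall values standing in for the comparison frames
train = pd.Series({"Model A": 0.95, "Model B": 0.96, "Model C": 0.97})
val = pd.Series({"Model A": 0.92, "Model B": 0.94, "Model C": 0.945})

# Smaller train-minus-validation gap means more consistent generalization
gap = (train - val).sort_values()
print(gap)
```

With the real frames, the same two lines applied to the `Recall` rows of `models_train_comp_df` and `models_val_comp_df` would rank all 14 models by overfitting at once.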
best_model = model_6
Now, let's check the performance of the final model on the test set.
best_model_test_perf = model_performance_classification(best_model,X_test,y_test)
best_model_test_perf
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 294us/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.979 | 0.925524 | 0.888231 | 0.905886 |
y_test_pred_best = best_model.predict(X_test)
cr_test_best_model = classification_report(y_test, y_test_pred_best>0.5) # Checking the classification report of best model on test data.
print(cr_test_best_model)
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 252us/step
              precision    recall  f1-score   support

           0       0.99      0.99      0.99      4718
           1       0.78      0.87      0.82       282

    accuracy                           0.98      5000
   macro avg       0.89      0.93      0.91      5000
weighted avg       0.98      0.98      0.98      5000
- The model's recall on the test data is about 92%, which is strong: catching that many failures before they occur lets repairs replace far more expensive generator replacements.
Actionable Insights and Recommendations¶
Actionable Insights¶
Reduces Costly Generator Replacements (High Recall): The model correctly identifies roughly 92% of potential failures, allowing timely repairs. Since repairing a generator is far cheaper than replacing it, the model directly minimizes replacement expenses, saving thousands per unit.
Moderate False-Alarm Rate (Precision ≈ 88.8%): Roughly 11% of the model's "failure" alerts are raised on turbines that are actually healthy, each triggering an unnecessary inspection. Inspection costs are low relative to repairs, so this false-alarm rate is acceptable.
Low Overall Error (Accuracy ≈ 97.9%): Only about 2% of all turbine observations would be misclassified, ensuring that the vast majority of operating time is either correctly left alone (no failure) or proactively serviced.
Balance of Recall vs. Precision By maximizing recall first (to avoid expensive replacements) while keeping precision near 90 %, Model 6 strikes the optimal cost trade-off: most true failures are caught, with a controlled number of “unnecessary” inspections.
Stable Generalization Test‐set F₁ ≃ 0.906 closely mirrors validation F₁ (≈ 0.921), indicating minimal overfitting. This stability suggests ReneWind can trust the model’s performance on new turbine data without aggressive regularization or frequent hyperparameter retuning.
Lowers Downtime and Increases Energy Output By predicting failures before breakdowns occur, turbines can be maintained without long unplanned outages. This leads to higher uptime, more consistent energy generation, and increased revenue from energy production.
Supports Scalable, Predictive Maintenance Strategy Instead of manually reviewing turbine health or reacting to alerts, the ML model automates and scales predictive maintenance across thousands of turbines—reducing monitoring overhead and enabling smarter operations even with a lean workforce.
Business Recommendations¶
Decreases Inventory Holding Costs With better foresight into upcoming part failures, ReneWind can streamline spare-part inventory, purchasing components only when needed. This cuts down storage costs and capital lock-in tied to overstocking.
Optimizes Maintenance Operations and Labor Allocation Failure predictions allow ReneWind to schedule inspections and repairs efficiently, avoiding unnecessary labor during low-risk periods. This ensures better technician utilization and reduces emergency call-out costs.
Enables Financial Forecasting of Failure Risks Aggregated model outputs can be used to estimate future maintenance and replacement costs, helping the finance team with budget planning and investment decisions.
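Such cost forecasts can be sketched directly from a confusion matrix. The per-event costs below are illustrative assumptions, not figures from ReneWind, and the counts are hypothetical:

```python
# Hypothetical per-event costs (assumptions for illustration only)
COST_REPLACE = 40_000  # generator replaced after an undetected failure (FN)
COST_REPAIR = 15_000   # pre-emptive repair of a correctly flagged failure (TP)
COST_INSPECT = 1_000   # inspection triggered by a false alarm (FP)

def expected_maintenance_cost(tp: int, fp: int, fn: int) -> int:
    """Total maintenance cost implied by a confusion matrix."""
    return tp * COST_REPAIR + fp * COST_INSPECT + fn * COST_REPLACE

# Counts from a hypothetical confusion matrix on 5,000 test turbines
print(expected_maintenance_cost(tp=245, fp=69, fn=37))
```

Comparing this figure against the no-model baseline (every failure becomes a replacement, i.e. `(tp + fn) * COST_REPLACE`) quantifies the savings the model delivers, which is the input the finance team needs for budgeting.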